From: Victor Hadianto
Organization: NUIX Pty. Ltd.
To: "Lucene Users List"
Subject: Re: QueryParser with stop/key words inside quotes
Date: Mon, 14 Apr 2003 17:12:25 +1100
References: <20030414030820.28779.qmail@web12701.mail.yahoo.com>
In-Reply-To: <20030414030820.28779.qmail@web12701.mail.yahoo.com>
Message-Id: <200304141612.25989@bah>

> The place to look is QueryParser.jj, method getFieldQuery, but it looks

I've been looking at QueryParser.jj and made the following modification:

I modified QueryParser to take two analyzers. The first is the normal
analyzer, which drops all the stop words from the query; the second does
not drop any words from the token stream. In QueryParser.jj I modified
the following:

    | term= [ slop= ] [ boost= ]
      {
        // If quoteAnalyzer is not null use the quoteAnalyzer
        if (quoteAnalyzer == null) {
          q = getFieldQuery(field, analyzer,
                            term.image.substring(1, term.image.length()-1));
        } else {
          q = getFieldQuery(field, quoteAnalyzer,
                            term.image.substring(1, term.image.length()-1));
        }

> However, would you even want to do something like that?
> If you use the same Analyzer, with the same list of stop words for both

Yes, again the drawback is that I have to index with the analyzer that
does not drop those words, so they all end up in the index. This will
probably grow our index considerably, but unfortunately it is our
requirement: we need to be able to search for something like
"apple and orange" or "apple for tomato".

> Otis

Thanks for the reply.

victor

> --- Victor Hadianto wrote:
> > Lucene's QueryParsers seems to drop stop/key words even if they are
> > enclosed in double quotes.
> >
> > For example:
> >
> > apple for tomato
> > --> +apple +tomato
> >
> > Which is what I expected, however
> >
> > "apple for tomato"
> > --> "apple tomato"
> >
> > and "for" in between apple and tomato is conveniently dropped.
> >
> > Is there a way to tell QueryParser not to drop those words if they
> > are enclosed in double quotes?
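
The two-analyzer idea described in the reply above can be illustrated with a small self-contained sketch. It does not use Lucene at all: SimpleAnalyzer, STOPPING, KEEP_ALL, the stop-word list and parse() are hypothetical stand-ins invented for this example, and the real change lives in the JavaCC grammar rather than in a hand-written method. It only shows the selection logic, assuming a query is either one fully quoted phrase or a list of bare terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TwoAnalyzerSketch {
    // Hypothetical stand-in for a Lucene Analyzer: text in, terms out.
    public interface SimpleAnalyzer {
        List<String> analyze(String text);
    }

    // Illustrative stop-word list; an assumption, not Lucene's actual list.
    static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("and", "for", "the", "of"));

    // Lower-case, split on whitespace, optionally drop stop words.
    static List<String> tokenize(String text, boolean dropStopWords) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase().trim().split("\\s+")) {
            if (t.isEmpty()) continue;
            if (dropStopWords && STOP_WORDS.contains(t)) continue;
            terms.add(t);
        }
        return terms;
    }

    // The "normal" analyzer drops stop words; the quote analyzer keeps them.
    public static final SimpleAnalyzer STOPPING = text -> tokenize(text, true);
    public static final SimpleAnalyzer KEEP_ALL = text -> tokenize(text, false);

    // Quoted phrases go through quoteAnalyzer when one is supplied,
    // otherwise through the normal analyzer (the original behaviour).
    public static String parse(String query, SimpleAnalyzer analyzer,
                               SimpleAnalyzer quoteAnalyzer) {
        String q = query.trim();
        boolean quoted = q.length() >= 2 && q.startsWith("\"") && q.endsWith("\"");
        if (quoted) {
            SimpleAnalyzer chosen = (quoteAnalyzer == null) ? analyzer : quoteAnalyzer;
            String inner = q.substring(1, q.length() - 1); // strip the quotes
            return "\"" + String.join(" ", chosen.analyze(inner)) + "\"";
        }
        StringBuilder sb = new StringBuilder();
        for (String term : analyzer.analyze(q)) {
            if (sb.length() > 0) sb.append(' ');
            sb.append('+').append(term);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(parse("apple for tomato", STOPPING, KEEP_ALL));     // +apple +tomato
        System.out.println(parse("\"apple for tomato\"", STOPPING, null));     // "apple tomato"
        System.out.println(parse("\"apple for tomato\"", STOPPING, KEEP_ALL)); // "apple for tomato"
    }
}
```

As noted in the reply, keeping "for" in the phrase only helps if the index was also built with an analyzer that keeps stop words; otherwise the phrase has nothing to match against.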