lucene-java-user mailing list archives

From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: QueryParser with stop/key words inside quotes
Date Mon, 14 Apr 2003 13:10:26 GMT
But if you use the same Analyzer with the same stop words for both
indexing and searching, then queries like "apples and oranges" _will_
find matches, even if "and" is included in the stop word list.
This will work because "and" will be dropped from both indexed text
_and_ from the query string.
However, "apples and oranges" will also find documents that contain
"apples XXX oranges", where XXX is any of your stop words.

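A minimal sketch of why both of those cases match, written in plain Java rather than against the Lucene API. It assumes the analyzer simply discards stop words and leaves no gap behind, so the surviving terms end up adjacent. The class name, the stop word list, and the helper methods are all made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an analyzer with a stop word list; not Lucene code.
class StopWordPhraseDemo {
    // Illustrative stop words only; a real analyzer's list will differ.
    static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("and", "for", "or", "the"));

    // Lowercase, split on non-letters, and drop stop words (assumption: no gap is kept).
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase().split("[^a-z]+")) {
            if (!token.isEmpty() && !STOP_WORDS.contains(token)) {
                terms.add(token);
            }
        }
        return terms;
    }

    // The same analyze() is applied to the document and to the phrase, then the
    // phrase terms must appear consecutively in the document terms.
    static boolean phraseMatches(String phrase, String document) {
        List<String> query = analyze(phrase);
        List<String> doc = analyze(document);
        return !query.isEmpty() && Collections.indexOfSubList(doc, query) >= 0;
    }

    public static void main(String[] args) {
        // "and" is dropped on both sides, so the phrase still matches: true
        System.out.println(phraseMatches("apples and oranges", "I like apples and oranges"));
        // A different stop word in the middle also matches: true
        System.out.println(phraseMatches("apples and oranges", "apples for oranges"));
        // A non-stop word in the middle breaks the phrase: false
        System.out.println(phraseMatches("apples and oranges", "apples beat oranges"));
    }
}
```
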
As for your query, I don't know, try it and see.  The computer will
tell you :)

Otis


--- Victor Hadianto <victorh@nuix.com.au> wrote:
> > The place to look is QueryParser.jj, method getFieldQuery, but it looks
> 
> I've been looking at QueryParser.jj and made the following
> modification:
> 
> I modified QueryParser to take two analyzers: the first is the normal
> analyzer, which drops all the stop words from the query, and the
> second does not drop any words from the token stream.
> 
> And in QueryParser.jj I modified the following:
> 
>      | term=<QUOTED>
>        [ slop=<SLOP> ]
>        [ <CARAT> boost=<NUMBER> ]
>        {
>          // Use quoteAnalyzer for quoted phrases when one is set;
>          // otherwise fall back to the normal analyzer.
>          if (quoteAnalyzer == null)
>          {
>             q = getFieldQuery(field, analyzer,
>                   term.image.substring(1, term.image.length()-1));
>          }
>          else
>          {
>             q = getFieldQuery(field, quoteAnalyzer,
>                   term.image.substring(1, term.image.length()-1));
>          }
> 
> 
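The two-analyzer idea quoted above can be illustrated outside of QueryParser.jj with a small plain-Java sketch. Everything in it (the class, the Analyzer interface, the toQuery method, the stop word list) is hypothetical and only mimics the choice made in the grammar: quoted text goes through quoteAnalyzer when one is set, and everything else goes through the normal analyzer. It does not use or reproduce the real Lucene API, and the "+term" output for unquoted input is an assumption modelled on the examples in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a parser that holds two analyzers; not Lucene code.
class TwoAnalyzerDemo {
    interface Analyzer {
        List<String> analyze(String text);
    }

    // Illustrative stop words only.
    static final Set<String> STOP_WORDS =
        new HashSet<>(Arrays.asList("and", "for", "or", "the"));

    // Normal analyzer: drops stop words. Quote analyzer: keeps every word.
    static final Analyzer STOPPING = text -> tokens(text, true);
    static final Analyzer KEEP_ALL = text -> tokens(text, false);

    static List<String> tokens(String text, boolean dropStopWords) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z]+")) {
            if (t.isEmpty()) continue;
            if (dropStopWords && STOP_WORDS.contains(t)) continue;
            out.add(t);
        }
        return out;
    }

    // Quoted input uses quoteAnalyzer if it is non-null, else the normal analyzer.
    static String toQuery(String input, Analyzer analyzer, Analyzer quoteAnalyzer) {
        boolean quoted = input.length() >= 2
            && input.startsWith("\"") && input.endsWith("\"");
        if (quoted) {
            String inner = input.substring(1, input.length() - 1);
            Analyzer chosen = (quoteAnalyzer == null) ? analyzer : quoteAnalyzer;
            return "\"" + String.join(" ", chosen.analyze(inner)) + "\"";
        }
        List<String> required = new ArrayList<>();
        for (String t : analyzer.analyze(input)) {
            required.add("+" + t);
        }
        return String.join(" ", required);
    }

    public static void main(String[] args) {
        System.out.println(toQuery("apple for tomato", STOPPING, KEEP_ALL));     // +apple +tomato
        System.out.println(toQuery("\"apple for tomato\"", STOPPING, null));     // "apple tomato"
        System.out.println(toQuery("\"apple for tomato\"", STOPPING, KEEP_ALL)); // "apple for tomato"
    }
}
```

As the rest of the message notes, keeping stop words in quoted queries only helps if those words are also present in the index.
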
> > However, would you even want to do something like that?
> > If you use the same Analyzer, with the same list of stop words for both
> 
> Yes, again the drawback is that I have to index with an analyzer that
> does not drop those words, so they all end up in the index. This will
> probably grow our index by a large amount, but unfortunately it is our
> requirement that we be able to search for something like
> 
> "apple and orange"
> 
> or 
> 
> "apple for tomato"
> 
> 
> > Otis
> 
> Thanks for the reply.
> 
> victor
> 
> > --- Victor Hadianto <victorh@nuix.com.au> wrote:
> > > Lucene's QueryParser seems to drop stop/key words even if they
> > > are enclosed in double quotes.
> > >
> > > For example:
> > >
> > > apple for tomato
> > > --> +apple +tomato
> > >
> > > Which is what I expected, however
> > >
> > > "apple for tomato"
> > > --> "apple tomato"
> > >
> > > and "for" in between apple and tomato is conveniently dropped.
> > >
> > > Is there a way to tell QueryParser not to drop those words if
> > > they are enclosed in double quotes?
> > >
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 

