lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Doug Cutting <>
Subject Re: QueryParser question
Date Tue, 31 Dec 2002 19:51:54 GMT
Doug Cutting wrote:
> However, in most cases where this is an issue, the real problem is that 
> folks are placing too much reliance on the query parser.  The query 
> parser is designed for user-entered queries.  If you're programmatically 
> generating query strings that are then fed to the query parser, then you 
> would be better served by directly constructing queries.

This bears emphasis.  Abuse of the query parser may be the single most 
common source of problems with Lucene.  We should probably add 
guidelines for query parser use to the FAQ and/or query parser 

Some rules of thumb are:

- If you are programmatically generating a query string and then parsing 
it with the query parser then you should seriously consider building 
your queries directly with the query API.  In other words, the query 
parser is designed for human-entered text, not for program-generated text.

- Untokenized fields are best added directly to queries, and not through 
the query parser.  If a field's values are generated programmatically by 
the application, then so should query clauses for this field. 
Analyzers, like the query parser, are designed to convert human-entered 
text to terms.  Program-generated values, like dates, keywords, etc., 
should be consistently program-generated.

- In a query form, fields which are general text should use the query 
parser.  All others, e.g., date ranges, keywords, etc. are better added 
directly through the query API.  A field with a limit set of values, 
that can be specified with a pulldown menu should not be added to a 
query string which is subsequently parsed, but rather added as a 
TermQuery clause.

I hope that by saying the same thing several times in slightly different 
ways folks will get the idea!  Of course, these are not absolute rules: 
there are exceptions.  The query parser can do more than it should.  But 
when this is done, problems frequently occur.  Caveat emptor.


To unsubscribe, e-mail:   <>
For additional commands, e-mail: <>

View raw message