lucene-java-user mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: Understanding Boolean Queries
Date Thu, 29 Apr 2004 17:53:35 GMT
Please don't crosspost to lucene-user and lucene-dev!

Tate Avery wrote:
> 3) The maxClauseCount threshold appears not to care whether or not my
> clauses are 'required' or 'prohibited'... only how many of them there are in
> total.

That's correct.  The limit is an attempt to prevent the out-of-memory 
errors that very large boolean queries can cause.  Wildcard queries, for 
example, can expand into enough clauses to serve as a denial-of-service 
attack.  If you're not worried about this, feel free to increase the limit.
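
To illustrate the point that the threshold only counts clauses, here is a minimal, self-contained sketch.  It is a toy model, not Lucene's code: the class ClauseLimitSketch, its add method, and the exception type are hypothetical names chosen for this example, and 1024 is simply the figure quoted later in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these names are assumptions, not Lucene's API.
class ClauseLimitSketch {
    // Global cap on clauses per query; 1024 is the figure quoted in this thread.
    static int maxClauseCount = 1024;

    private final List<String> clauses = new ArrayList<>();

    // The check looks only at how many clauses exist so far.
    // The required/prohibited flags play no part in it.
    void add(String term, boolean required, boolean prohibited) {
        if (clauses.size() >= maxClauseCount) {
            throw new IllegalStateException(
                "too many clauses: limit is " + maxClauseCount);
        }
        String prefix = required ? "+" : (prohibited ? "-" : "");
        clauses.add(prefix + term);
    }

    int size() {
        return clauses.size();
    }
}
```

In this model, adding a 1025th clause fails whatever its flags are, and raising maxClauseCount beforehand lets it through, at the cost of allowing larger queries to consume more memory.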

> 4) My BooleanQuery will prepare its own Scorer instance (i.e.
> BooleanScorer).  And, during this step, it will identify to the scorer which
> clauses are 'required' or 'prohibited'.  And, if more than 32 fall into this
> category, an IndexOutOfBoundsException ("More than 32 required/prohibited
> clauses in query.") is thrown.

That's correct.  This is a limitation of the implementation, which uses 
32-bit masks to accelerate boolean operations.  If this is a problem for 
you, please submit a bug report.  The workaround is to split the clauses 
across several boolean queries and add those as clauses of an outer 
boolean query, which is awkward, but it does work.
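
The following sketch shows why a 32-bit mask caps the number of required/prohibited clauses, and how grouping gets around it.  It is a toy model under stated assumptions, not the BooleanScorer source: MaskSketch, assignBits, matches, and partition are hypothetical names, and only the exception message is taken from this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these names are assumptions, not Lucene's API.
class MaskSketch {
    static final int MAX_MASKED = 32; // number of bits in a Java int

    // Give each required/prohibited clause its own bit in an int mask.
    static int[] assignBits(int maskedClauses) {
        if (maskedClauses > MAX_MASKED) {
            throw new IndexOutOfBoundsException(
                "More than 32 required/prohibited clauses in query.");
        }
        int[] bits = new int[maskedClauses];
        for (int i = 0; i < maskedClauses; i++) {
            bits[i] = 1 << i;
        }
        return bits;
    }

    // A document matches when every required bit is set and no prohibited bit is.
    static boolean matches(int docBits, int requiredMask, int prohibitedMask) {
        return (docBits & requiredMask) == requiredMask
            && (docBits & prohibitedMask) == 0;
    }

    // Workaround: split the clauses into groups of at most groupSize.
    // Each group would become one inner query, itself a single clause of the outer query.
    static <T> List<List<T>> partition(List<T> clauses, int groupSize) {
        List<List<T>> groups = new ArrayList<>();
        for (int start = 0; start < clauses.size(); start += groupSize) {
            int end = Math.min(start + groupSize, clauses.size());
            groups.add(new ArrayList<>(clauses.subList(start, end)));
        }
        return groups;
    }
}
```

For example, 100 required clauses partitioned with a group size of 32 yield four groups (32, 32, 32, and 4 clauses), so the outer query carries only four required clauses and each inner query stays within the mask.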

> Now, I am a bit confused at this point.  Does this mean I can make a boolean
> query consisting of up to 1024 clauses as long as no more than 32 of them
> are required or prohibited?

Yes, that's correct.  The first limit (maxClauseCount) is easily raised, 
but lifting the second requires changes to Lucene.

Doug
