lucene-dev mailing list archives

From: Tate Avery
Subject: RE: Understanding Boolean Queries
Date: Thu, 29 Apr 2004 17:30:10 GMT

Thank you for the response.

I am not using the QueryParser directly... it was just part of my overall
understanding of how this exception is coming about.  Same thing,
essentially, with the maxClauseCount.

Here is some code to illustrate what is confusing me and what I am trying to
do:

	int _numClauses = XXX;
	boolean _required = XXX;  // 3 examples of these var settings below

	BooleanQuery _query = new BooleanQuery();

	for (int _i = 0; _i < _numClauses; _i++)
		_query.add(
			new BooleanClause(
				new TermQuery(new Term("body", "term" + _i)),
				_required,   // required
				false));     // prohibited

	Hits _hits = new IndexSearcher(INDEX_DIR).search(_query);

1) With _numClauses=9999 and _required=false (for example), I have no
problems.
(This is confusing since 9999 is more than maxClauseCount... but I won't

2) With _numClauses=32 and _required=true, I also have no problems.

3) With _numClauses=33 and _required=true, I get
"java.lang.IndexOutOfBoundsException: More than 32 required/prohibited
clauses in query." as a runtime exception.

So, I guess I am trying to ask the following:

Is a query like (T1 AND T2 AND ... AND T32 AND T33) just completely illegal
for Lucene?
OR is there some way to extend this limit?
OR am I missing something that is clouding my understanding?
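
One possible reading of the three cases above, offered as a sketch and not as
the actual Lucene source: the behaviour is what you would see if the scorer
gave each required or prohibited clause its own bit in a single 32-bit int
mask, while optional clauses consumed no bit. The class ClauseMaskSketch and
its methods are hypothetical names invented for this illustration; only the
exception text is taken from the message above.

```java
// Hypothetical model, NOT Lucene code: assumes one bit of a 32-bit int
// is reserved for every required or prohibited clause.
class ClauseMaskSketch {
    private int nextMask = 1;          // bit that the next constrained clause would receive
    private int constrainedClauses = 0;

    // Register one clause; optional clauses (neither flag set) use no bit.
    void add(boolean required, boolean prohibited) {
        if (required || prohibited) {
            if (nextMask == 0) {       // all 32 bits have already been handed out
                throw new IndexOutOfBoundsException(
                    "More than 32 required/prohibited clauses in query.");
            }
            nextMask <<= 1;            // becomes 0 after the 32nd shift
            constrainedClauses++;
        }
    }

    // Add numClauses clauses with the same 'required' flag and report
    // how many of them consumed a mask bit.
    static int addClauses(int numClauses, boolean required) {
        ClauseMaskSketch sketch = new ClauseMaskSketch();
        for (int i = 0; i < numClauses; i++) {
            sketch.add(required, false);
        }
        return sketch.constrainedClauses;
    }

    public static void main(String[] args) {
        System.out.println(addClauses(9999, false)); // case 1: optional clauses only
        System.out.println(addClauses(32, true));    // case 2: exactly 32 required
        try {
            addClauses(33, true);                    // case 3: one clause too many
        } catch (IndexOutOfBoundsException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under that assumption, the limit of 32 would be a property of the int mask and
separate from maxClauseCount, which would explain why only the count of
required/prohibited clauses matters in cases 2 and 3.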


-----Original Message-----
From: Stephane James Vaucher
Sent: Thursday, April 29, 2004 1:10 PM
To: Lucene Users List
Subject: Re: Understanding Boolean Queries

On Thu, 29 Apr 2004, Tate Avery wrote:

> Hello,
> I have been reviewing some of the code related to boolean queries and I
> wanted to see if my understanding is approximately correct regarding how
> they are handled and, more importantly, the limitations.

You can always submit requests for enhancements in Bugzilla, so as to keep
track of this issue.

> Here is what I have come to understand so far:
> 1) The QueryParser code generated from javacc will parse my boolean query
> and determine for each clause whether or not it is 'required' (based on a few
> conditions, but, in short, whether or not it was introduced or followed by
> 'AND') or 'prohibited' (based, in short, on it being preceded by 'NOT').

Your usage seems pretty particular, why are you using the javacc

> 2) As my BooleanQuery is being constructed, it will throw a
> BooleanQuery.TooManyClauses exception if I exceed
> BooleanQuery.maxClauseCount (which defaults to 1024).

It's configurable through system properties or by
BooleanQuery.setMaxClauseCount(int maxClauseCount)

> 3) The maxClauseCount threshold appears not to care whether or not my
> clauses are 'required' or 'prohibited'... only how many of them there are
> total.
> 4) My BooleanQuery will prepare its own Scorer instance (i.e.
> BooleanScorer).  And, during this step, it will identify to the scorer
> which clauses are 'required' or 'prohibited'.  And, if more than 32 fall
> into this category, an IndexOutOfBoundsException ("More than 32
> required/prohibited clauses in query.") is thrown.
> That's as far as I got.
> Now, I am a bit confused at this point.  Does this mean I can make a
> query consisting of up to 1024 clauses as long as no more than 32 of them
> are required or prohibited?  This doesn't seem right.  So, am I missing
> something in the way I am understanding this?
> I am (as you may have guessed) generating large boolean queries.  And, in
> some rare cases, I am receiving the exception identified in #4 (above).
> I am trying to figure out whether or not I need to change/filter my
> queries in a special way in order to avoid this exception.  And, in order
> to do
> this, I want to understand how these queries are being handled.
> Finally, is there something related to the query syntax that could be my
> mistake?  For example, what is the difference between:
> 	"A B" AND "C D" AND "D E"
> ... and...
> 	("A B") AND ("C D") AND ("D E")
> ... could that be the crux of it?

I can't help you here, and the doc seems rather thin (or nonexistent for
this class). I don't know the relation between the query and how the
scorer will process it.

Sorry I can't be of assistance,

> Thank you for your time,
> Tate Avery
> ---------------------------------------------------------------------
> To unsubscribe, e-mail:
> For additional commands, e-mail:
