lucene-java-user mailing list archives

From "Russell M. Allen" <>
Subject RE: PrefixQuery rewrite() bug, ignores max clause count
Date Mon, 17 Jul 2006 13:18:53 GMT
: 1) colorizing your mail doesn't play nicely with the mailing list ...
: sending "diffs" is the preferred way to show potential changes to the

So I discovered on sending the message.  :)

: 2) assuming i understand what it is you've changed, you've "worked
: around" the TooManyClausesException in a somewhat complicated way (you
: could just as easily increase the maxClauseCount and save yourself
: headache) ...

Actually, we have increased the maxClauseCount in our production code to
address this problem.  (We don't want to maintain a custom version of

: by doing this you bypass the whole point of having a
: maxClauseCount: to prevent a query that will consume too many resources
: (namely RAM and query execution time) from being constructed.

I understand the concern for resources, but by throwing an exception you
have also preempted a legitimate query.  How then should one go about
performing a query for all documents where a field begins with char X?
We have 60,000+ documents, of which B, S and T seem to be the most
common initial char (just over 7000 documents each).  We are building an
'index' with the search results, thus the need for a query like:

IMHO, both your concern for resources and the ability to query for
"name:b*" are legitimate.  Perhaps PrefixQuery should be able to
gracefully degrade when it hits resource limits.  I admit I am at the
limit of my Lucene knowledge here, but I assume that rewrite() is an
optimization of the query and can be skipped (with some additional code)
at the cost of performance. 
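
To make the trade-off concrete, here is a rough sketch of the general idea
behind prefix expansion: one clause per matching term, with a limit check.
This is a toy model in plain Java, not Lucene's code. The class name, the
expand() method and the TooManyClausesException stand-in are all
hypothetical, and the sorted set merely plays the role of the term
dictionary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class PrefixExpansionSketch {

    /** Hypothetical stand-in for the exception discussed in this thread. */
    static class TooManyClausesException extends RuntimeException {
        TooManyClausesException(int limit) {
            super("more than " + limit + " clauses");
        }
    }

    /**
     * Expands a prefix into one clause per matching term by walking the
     * sorted term dictionary from the prefix onward.
     */
    static List<String> expand(SortedSet<String> terms, String prefix, int maxClauseCount) {
        List<String> clauses = new ArrayList<>();
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match the prefix
            }
            if (clauses.size() >= maxClauseCount) {
                // Adding this term would exceed the limit, so give up.
                throw new TooManyClausesException(maxClauseCount);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        SortedSet<String> names = new TreeSet<>(
                Arrays.asList("baker", "barnes", "bell", "smith", "stone", "taylor"));

        System.out.println(expand(names, "b", 1024)); // [baker, barnes, bell]

        try {
            expand(names, "b", 2); // three terms match, only two clauses allowed
        } catch (TooManyClausesException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The cost grows with the number of distinct terms under the prefix, not with
the number of matching documents, which is why a single-letter prefix over
a name field hits the limit so easily.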

: 3) your change modifies the score impacts from each of the Terms that
: match the prefix ... if you don't care about the score impacts from the
: terms, then there are other options.  Solr has a PrefixFilter class
: which can be wrapped in a ConstantScoreQuery to support prefix queries
: regardless of your term distribution.

We do not rank or score results at all (Lucene is a high speed index for
us).  As a result, I am blissfully ignorant of scoring results.  Out of
curiosity though, does the depth of a term in a query tree affect its

I will take a look at PrefixFilter and ConstantScoreQuery.
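
For reference, my rough understanding of the filter idea, sketched as a toy
in plain Java. This is not Solr's PrefixFilter or Lucene's
ConstantScoreQuery; the class name, matchPrefix() and the in-memory
postings map are hypothetical. Instead of building one clause per term, it
ORs the document ids of every matching term into a single bit set, so no
clause limit applies and every hit is treated alike (no per-term score).

```java
import java.util.BitSet;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

public class PrefixFilterSketch {

    /**
     * Collects every document that contains a term starting with the prefix.
     * The postings map (an assumption of this sketch) maps each term to the
     * ids of the documents containing it.
     */
    static BitSet matchPrefix(SortedMap<String, int[]> postings, String prefix, int maxDoc) {
        BitSet docs = new BitSet(maxDoc);
        for (Map.Entry<String, int[]> entry : postings.tailMap(prefix).entrySet()) {
            if (!entry.getKey().startsWith(prefix)) {
                break; // sorted order: no later term can match the prefix
            }
            for (int docId : entry.getValue()) {
                docs.set(docId); // union of postings, duplicates collapse
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        SortedMap<String, int[]> postings = new TreeMap<>();
        postings.put("baker", new int[] {0, 3});
        postings.put("bell", new int[] {1});
        postings.put("smith", new int[] {2, 3});

        System.out.println(matchPrefix(postings, "b", 4)); // {0, 1, 3}
    }
}
```

Memory use here is bounded by the number of documents (one bit each), no
matter how many terms share the prefix.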

