lucene-general mailing list archives

From Simon Willnauer <>
Subject Re: Combine WildcardQuerys
Date Tue, 22 Dec 2009 22:38:50 GMT
I might have confused you a bit with my last reply, so I will try to
elaborate a little.
Internally, Lucene uses a method called Query#rewrite() that rewrites a
query into a "primitive" query, typically a BooleanQuery consisting of
TermQuery instances. In most cases this is completely hidden from the
API user; it is only called directly to solve very specific and likely
advanced problems.
In order to be scored, a Lucene 2.4 WildcardQuery rewrites to a
BooleanQuery consisting of one TermQuery for each term in the index
that matches the wildcard pattern passed to it. If hundreds or
thousands of terms match a wildcard ("b*" is likely to match a lot of
terms in the index), Lucene will internally raise an exception saying
that the BooleanQuery has too many clauses (see
for details).
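
To make the expansion step concrete, here is a minimal sketch in plain
Java. It is a toy model and not Lucene code: the class, the method
expandPrefix and the exception type are hypothetical stand-ins, and the
pattern is assumed to be a simple prefix ending in '*'. It only imitates
the idea of producing one clause per matching term and failing once a
clause limit (1024 by default in Lucene's BooleanQuery) is exceeded.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: this is NOT Lucene code. It imitates how a prefix wildcard
// is expanded into one clause per matching term, subject to a clause limit.
class WildcardExpansionSketch {

    // Hypothetical stand-in for BooleanQuery.TooManyClauses.
    static class TooManyClausesException extends RuntimeException {
        TooManyClausesException(int limit) {
            super("more than " + limit + " clauses");
        }
    }

    // Collects every dictionary term matching a pattern such as "b*".
    // Assumes the pattern ends with a single trailing '*'.
    // Throws as soon as the matches would exceed maxClauseCount.
    static List<String> expandPrefix(List<String> termDictionary,
                                     String pattern, int maxClauseCount) {
        String prefix = pattern.substring(0, pattern.length() - 1);
        List<String> clauses = new ArrayList<String>();
        for (String term : termDictionary) {
            if (term.startsWith(prefix)) {
                if (clauses.size() >= maxClauseCount) {
                    throw new TooManyClausesException(maxClauseCount);
                }
                clauses.add(term);
            }
        }
        return clauses;
    }

    public static void main(String[] args) {
        List<String> forenames =
            Arrays.asList("anna", "barbara", "ben", "bob", "carl");
        // Three terms match "b*", so a limit of 10 is fine.
        System.out.println(expandPrefix(forenames, "b*", 10));
        try {
            // With a limit of 2 the third matching term triggers the failure.
            expandPrefix(forenames, "b*", 2);
        } catch (TooManyClausesException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

With a real index the dictionary can hold many thousands of forenames
starting with "b", which is why a short prefix hits the limit so easily.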

In Lucene 2.9 this can be handled differently, with a specific rewrite
method that does not rewrite to a BooleanQuery but, for instance, to a
ConstantScoreQuery wrapping a filter. To use this feature you need to
upgrade to at least Lucene 2.9. See
and friends for details.
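
The filter idea can be sketched in plain Java as well. Again this is a
toy model and not Lucene code: the class, the method prefixFilter and the
in-memory postings map are hypothetical. Instead of building one clause
per term, it marks every matching document in a bit set, so the number of
matching terms no longer matters, and every hit ends up with the same
constant score.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: this is NOT Lucene code. It imitates a filter-style
// rewrite that marks matching documents in a bit set, with no clause list.
class ConstantScoreFilterSketch {

    // postings maps each term to the ids of the documents containing it.
    // Returns a bit set with one bit set per document that contains at
    // least one term starting with the given prefix.
    static BitSet prefixFilter(Map<String, int[]> postings,
                               String prefix, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (Map.Entry<String, int[]> entry : postings.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                for (int docId : entry.getValue()) {
                    bits.set(docId);
                }
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        Map<String, int[]> forenames = new TreeMap<String, int[]>();
        forenames.put("barbara", new int[] {0, 3});
        forenames.put("ben", new int[] {1});
        forenames.put("carl", new int[] {2});

        Map<String, int[]> surnames = new TreeMap<String, int[]>();
        surnames.put("foo", new int[] {3});
        surnames.put("bar", new int[] {0, 1, 2});

        // Both conditions are required (like Occur.MUST): intersect the sets.
        BitSet hits = prefixFilter(forenames, "b", 4);
        hits.and(prefixFilter(surnames, "foo", 4));

        // Every hit gets the same constant score; there is no per-term scoring.
        float constantScore = 1.0f;
        for (int doc = hits.nextSetBit(0); doc >= 0; doc = hits.nextSetBit(doc + 1)) {
            System.out.println("doc " + doc + " score " + constantScore);
        }
    }
}
```

In Lucene 2.9 itself the rewrite behaviour is selected on the query via
MultiTermQuery#setRewriteMethod, for example with
MultiTermQuery.CONSTANT_SCORE_FILTER_REWRITE; the javadoc of that release
is the authority on the exact options.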


On Tue, Dec 22, 2009 at 2:22 PM, Simon Willnauer
<> wrote:
> Hi Claudio,
> your query setup is fine for what you are trying to do, but as you are
> using wildcards Lucene internally rewrites your WildcardQuery
> into another BooleanQuery containing every term starting with your
> prefix "b". If you use such a short prefix, Lucene will likely create
> tons of boolean clauses, one for each term starting with your prefix.
> Yet BooleanQuery has a limit (maxClauseCount) that triggers
> the exception you are facing once you hit the clause count limit. You
> can try to raise the limit in BooleanQuery, but you will very likely
> end up with bad search performance.
> Lucene 2.9 provides alternative rewrite methods for MultiTermQuerys
> like WildcardQuery that perform way better than the plain boolean
> rewrite method. To achieve faster wildcard queries you will end up with
> a constant score instead of the normal Lucene score a BooleanQuery
> would create, so each hit will have the same constant score assigned
> for your WildcardQuery.
> Look at for details.
> hope that helps
> simon
> On 12/22/09, Claudio Deluca <> wrote:
>> Hello,
>> We have currently implemented a search for persons by surname and forename
>> with Lucene 2.4.1.
>> If both search fields are filled, we combine the WildcardQuerys in a
>> BooleanQuery.
>>
>> BooleanQuery theQuery = new BooleanQuery();
>> theQuery.add(new WildcardQuery(new Term("surname", "foo")), Occur.MUST);
>> theQuery.add(new WildcardQuery(new Term("forename", "b*")), Occur.MUST);
>> LuceneSearcherFactory theSearcherFactory =
>> LuceneSearcherFactory.getInstance();
>> Searcher theSearcher = theSearcherFactory.getSearcher();
>> theRewritten = theSearcher.rewrite(theQuery);
>>
>> In the database there is exactly one person with surname "foo". When I
>> comment out the second term (forename), the search works fine.
>> If I run the search including the term "forename" "b*", the Searcher throws a
>> TooManyClauses exception while trying to rewrite the query.
>> While rewriting, the searcher seems to find too many possibilities for
>> forenames beginning with "b".
>> How do I have to combine the terms so that the Lucene search works properly?
>> Thanks,
>> Claudio
