lucene-java-user mailing list archives

From John Byrne <john.by...@propylon.com>
Subject Re: BooleanQuery TooManyClauses in wildcard search
Date Fri, 30 Nov 2007 19:07:18 GMT
Hi,

Your problem is that when you do a wildcard search, Lucene expands the 
wildcard term into all possible terms. So, searching for "stat*" 
produces a list of terms like "state", "states", "stating" etc. (It only 
uses terms that actually occur in your index, however.) These terms are 
all added as OR clauses of a boolean query.
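
To make the expansion concrete, here is a toy sketch in plain Java. It does not use Lucene and is not how Lucene implements it internally; the class name PrefixExpansionSketch, the expand method, and the sample terms are all made up for illustration. It only shows the idea of turning a prefix into the indexed terms that match it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class PrefixExpansionSketch {

    // Hypothetical helper: return every indexed term that starts with the prefix.
    static List<String> expand(SortedSet<String> indexedTerms, String prefix) {
        List<String> matches = new ArrayList<>();
        // In a sorted set, all terms sharing a prefix sit next to each other,
        // starting at the first term that is >= the prefix itself.
        for (String term : indexedTerms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // past the last matching term
            }
            matches.add(term);
        }
        return matches;
    }

    public static void main(String[] args) {
        // Stand-in for the terms that actually occur in an index.
        SortedSet<String> terms = new TreeSet<>(
                Arrays.asList("star", "state", "states", "stating", "stop", "tree"));

        // Each matching term would become one OR clause of the boolean query.
        System.out.println(String.join(" OR ", expand(terms, "stat")));
        // prints: state OR states OR stating
    }
}
```

With a real index the term list is far larger, so a short prefix such as "t" can easily match thousands of terms.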

The thing is, by default, there is a limit of 1024 clauses for a boolean 
query. If your wildcard term expands into more than this (which happens 
very easily), you get the exception you described. You can solve the 
issue by setting the maximum clause count yourself, using

BooleanQuery.setMaxClauseCount(int maxClauseCount)

See 
http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/core/index.html 
for more info.
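
As a rough model of what the limit does, here is another plain Java sketch with no Lucene dependency. Everything in it (ClauseLimitSketch, TooManyClausesSketch, buildOrClauses, fakeExpansion, and the value 4096) is a hypothetical stand-in; only the default of 1024 comes from the text above. It assumes the check simply rejects an expansion with more clauses than the limit.

```java
import java.util.ArrayList;
import java.util.List;

public class ClauseLimitSketch {

    // Stand-in for the configurable limit; 1024 is the default quoted above.
    static int maxClauseCount = 1024;

    // Stand-in for the TooManyClauses exception.
    static class TooManyClausesSketch extends RuntimeException {
        TooManyClausesSketch(int requested, int limit) {
            super("expansion produced " + requested + " clauses, limit is " + limit);
        }
    }

    // Accept the expanded terms as OR clauses only if they fit under the limit.
    static List<String> buildOrClauses(List<String> expandedTerms) {
        if (expandedTerms.size() > maxClauseCount) {
            throw new TooManyClausesSketch(expandedTerms.size(), maxClauseCount);
        }
        return new ArrayList<>(expandedTerms);
    }

    // Generate n made-up terms, as if a short prefix had matched them all.
    static List<String> fakeExpansion(int n) {
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            terms.add("t" + i);
        }
        return terms;
    }

    public static void main(String[] args) {
        List<String> expanded = fakeExpansion(2000);
        try {
            buildOrClauses(expanded); // 2000 > 1024, so this is rejected
        } catch (TooManyClausesSketch e) {
            System.out.println("rejected: " + e.getMessage());
        }

        maxClauseCount = 4096; // raise the limit, then retry
        System.out.println("accepted: " + buildOrClauses(expanded).size() + " clauses");
    }
}
```

In your own code the corresponding step would be the BooleanQuery.setMaxClauseCount call mentioned above, made before the query is run. Raising the limit trades the exception for higher memory use, so pick a value with your index size in mind.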

Bear in mind that the nearer the wildcard is to the start of the term, 
the more indexed terms it matches, so the query gets a larger number of 
boolean clauses and uses more memory. This is the reason for the default 
limit. This limit will also affect fuzzy queries, because they are 
expanded in the same way.

Regards,
JB

Ruchi Thakur wrote:
>  
>   Hi there. 
> I am a new Lucene user and I have been searching the group archives but couldn't solve
> the problem. I have just joined a project that uses Lucene. 
> We use the StandardAnalyzer for indexing our documents and our query is as 
> follows when we issue a search string of    t*      for example:
>   +t* +cont_type:pa
>    
>   We get the following exception when we issue some of our wildcard text searches:
>   org.apache.lucene.search.BooleanQuery$TooManyClauses Exception : Max clause if 1024
>    
>   Please suggest.
>    
>   Regards,
>   Ruchi

