lucene-dev mailing list archives

From Doug Cutting <cutt...@lucene.com>
Subject Re: too many hits - OutOfMemoryError
Date Wed, 28 May 2003 21:17:49 GMT
[ Moved to lucene-dev.  -drc ]

David_Birthwell@VWR.COM wrote:
> But, wildcard queries that expand to many terms are always going to be
> memory intensive in Lucene.  We ran into this problem and decided to put a
> check on the number of expanded terms and abort the query if the number got
> too high. 

Perhaps we should make this a feature of Lucene.

Different types of queries expand in different ways, but they all
expand into a BooleanQuery.  So perhaps BooleanQuery should get:

   public static int getMaxClauseCount();
   public static void setMaxClauseCount(int maxClauseCount);

When more than the specified number of clauses is added, an exception
would be thrown.
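
A minimal sketch of how such a check might behave. ClauseLimitedQuery is a hypothetical stand-in, not the actual BooleanQuery; clauses are plain strings, and the use of IllegalStateException is an assumption, since the exception type is not specified above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for BooleanQuery, used only to illustrate the proposal.
class ClauseLimitedQuery {
    // One limit shared by all instances, matching the proposed static accessors.
    private static int maxClauseCount = 1024;

    // Clauses are modelled as strings here; a real query would hold sub-queries.
    private final List<String> clauses = new ArrayList<>();

    public static int getMaxClauseCount() {
        return maxClauseCount;
    }

    public static void setMaxClauseCount(int maxClauseCount) {
        if (maxClauseCount < 1) {
            throw new IllegalArgumentException("maxClauseCount must be >= 1");
        }
        ClauseLimitedQuery.maxClauseCount = maxClauseCount;
    }

    // Rejects the clause that would push the query past the limit.
    // The exception type is an assumption made for this sketch.
    public void add(String clause) {
        if (clauses.size() >= maxClauseCount) {
            throw new IllegalStateException(
                "too many clauses: limit is " + maxClauseCount);
        }
        clauses.add(clause);
    }

    public int size() {
        return clauses.size();
    }
}
```

With the limit set to 2, a third call to add() on the same query would throw, which is where an expanding wildcard query would be aborted.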

Further, I propose that the default for BooleanQuery.getMaxClauseCount()
would be 1024.  Each TermQuery requires around 2k bytes to process.
This would thus limit a single expansion to around 2MB; however, queries
with multiple wildcard terms could use more.
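
The arithmetic behind that estimate can be sketched as follows. ExpansionMemoryEstimate and its method are hypothetical names; the 2k figure is the rough per-TermQuery cost quoted above, and the sketch assumes each wildcard term expands into its own BooleanQuery, so the limit applies to each one separately.

```java
// Hypothetical helper; only reproduces the back-of-the-envelope estimate.
class ExpansionMemoryEstimate {
    // "Around 2k bytes" per TermQuery, taken from the message as a rough figure.
    static final long BYTES_PER_TERM_QUERY = 2 * 1024;

    // Worst case when every wildcard term expands to the full clause limit.
    static long estimateBytes(int maxClauseCount, int wildcardTerms) {
        return (long) maxClauseCount * BYTES_PER_TERM_QUERY * wildcardTerms;
    }
}
```

Under these assumptions, one wildcard term at the default limit comes to 1024 * 2048 = 2,097,152 bytes (about 2MB), and a query with three such terms could reach about 6MB.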

This simple fix would probably stop most OutOfMemory problems, which 
affect everyone, while only affecting a very small fraction of queries. 
The queries that are affected are in most cases probably not useful
queries anyway.  If someone really wishes to permit terms to expand 
further, then they can always call BooleanQuery.setMaxClauseCount().

Comments?

Doug

