lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Sanyi <need4...@yahoo.com>
Subject Re: Bug in the BooleanQuery optimizer? ..TooManyClauses
Date Fri, 12 Nov 2004 10:13:52 GMT
> It is normally possible to reduce the numbers of such complaints a lot 
> by imposing a minimum prefix length

I've alread limited it to a minimum of 5 characters (abcde*).
I can still easily find (for the first try) situations where it starts to search for minutes.
While another 5 char. partial words are searching for a second.
So, this is not a solution at all.

> and eg. doubling or tripling the max. nr. of clauses.

This is the only useful thing I could do and the other way I've found is similar: Unlimiting
the
number of clauses, but limiting the memory given for java.
It'll the throw an exception if things are getting too hard for the searcher.

Anyway, this avoids DoS attacks, but results in very poor user interface and search abiliy.
For example: "rareword AND commonfragment*" would still refuse to work.
I won't be able to explain it to my users, since they don't need my technical reasons. They'll
only notice that "dodge AND vip*" fails to search instead of returning 1000 documents.

If I unlimit everything and don't care about possible DoS attacks, it is still poor.
It'll search for "dodge AND vip*" for two minutes, just because "vip*" is too common in the
entire
document set.
It doesn't matter that "dodge" is pretty rare and we're AND-ing it with "vip*".



		
__________________________________ 
Do you Yahoo!? 
Check out the new Yahoo! Front Page. 
www.yahoo.com 
 


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message