lucene-dev mailing list archives

From Doug Cutting
Subject Re: FuzzyQuery prefix length
Date Mon, 18 Oct 2004 20:44:45 GMT
Daniel Naber wrote:
> Searching for Photokopie~ on a 230,000 document corpus takes 2.3 seconds here 
> (AMD Athlon 2600+; other fuzzy terms get similar performance). As the number 
> of terms doesn't increase so fast with more documents, it will not take 10 
> seconds for 1 million documents. So fuzzy search isn't *that* slow.

How long do non-fuzzy queries take?  What is the ratio?  How about a 
query with multiple fuzzy terms?

If someone launches a service but fails to test it with fuzzy queries, 
will they be subject to inadvertent denial-of-service when a user starts 
using fuzzy queries?  Web-based search is particularly vulnerable.  If a 
query takes a few seconds and the user hits his browser's STOP and 
RELOAD buttons, the first query keeps running on the server.
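
One way to limit the damage is to give every query a time budget and 
abandon it once the budget is spent.  The following is only a minimal 
sketch using plain java.util.concurrent; BoundedSearch and its methods 
are hypothetical names, not part of Lucene, and the search itself is 
passed in as an ordinary Callable.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper (not a Lucene class): runs searches under a time budget. */
final class BoundedSearch {
    private final ExecutorService pool;
    private final long timeoutMillis;

    BoundedSearch(int threads, long timeoutMillis) {
        // A fixed pool also caps how many searches execute at the same time;
        // further submissions wait in the pool's queue.
        this.pool = Executors.newFixedThreadPool(threads);
        this.timeoutMillis = timeoutMillis;
    }

    /**
     * Runs the search and returns its (non-null) result, or an empty
     * Optional if the search did not finish within the time budget.
     */
    <T> Optional<T> run(Callable<T> search)
            throws ExecutionException, InterruptedException {
        Future<T> future = pool.submit(search);
        try {
            return Optional.of(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // cancel(true) interrupts the worker thread. The thread is only
            // freed if the search code reacts to interruption; otherwise the
            // search keeps running until it completes on its own.
            future.cancel(true);
            return Optional.empty();
        }
    }

    /** Stops accepting work and interrupts any searches still running. */
    void shutdown() {
        pool.shutdownNow();
    }
}
```

A caller would wrap the actual search call in the Callable and treat an 
empty result as "query too expensive".  Note the caveat in the comment: 
the caller stops waiting after the timeout, but the server only stops 
working if the search code checks for interruption, which is an 
assumption about the search implementation and not something this 
sketch can guarantee.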

This is not an imaginary problem.  I have worked with several clients 
who have run into this in deployed applications.

