lucene-dev mailing list archives

From Doug Cutting <cutt...@apache.org>
Subject Re: cvs commit: jakarta-lucene/src/java/org/apache/lucene/search FuzzyQuery.java FuzzyTermEnum.java
Date Tue, 14 Sep 2004 16:04:07 GMT
goller@apache.org wrote:
>   QueryParser can now handle minimumSimilarity parameter
>   of FuzzyQuery; FuzzyQuery extended to allow for non-fuzzy
>   prefixes.

This looks great!

It might also be good if one could set the non-fuzzy prefix length used 
by the QueryParser.  As it stands, fuzzy queries built by QueryParser 
against large indexes are so slow that they're unusable.  But a default 
prefix of just a couple of characters would be a huge performance 
improvement.
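
To illustrate why a short non-fuzzy prefix helps, here is a minimal sketch in plain Java. It is not Lucene code: the class `PrefixCandidates`, the method `candidates`, and the use of a `SortedSet<String>` as a stand-in for the index's sorted term dictionary are all assumptions made for illustration. The point is only that a required prefix lets the enumeration seek to one contiguous slice of the sorted terms instead of comparing the query against every term.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

// Hypothetical helper, not part of Lucene: models the term dictionary
// as a sorted set of strings.
class PrefixCandidates {

    // Returns the terms that share the first prefixLength characters
    // with queryTerm. Only these would need a fuzzy comparison.
    static List<String> candidates(SortedSet<String> terms,
                                   String queryTerm,
                                   int prefixLength) {
        int n = Math.max(0, Math.min(prefixLength, queryTerm.length()));
        String prefix = queryTerm.substring(0, n);
        List<String> out = new ArrayList<>();
        // In sorted order, all terms with this prefix are contiguous and
        // begin at the first term >= prefix, so we can stop at the first miss.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break;
            }
            out.add(term);
        }
        return out;
    }
}
```

With a prefix length of 0 the loop visits every term, which corresponds to the slow case described above; with a prefix of two or three characters it visits only the terms that start with those characters.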

Another idea might be to limit the expanded terms by number, rather than 
(or in addition to) limiting them by similarity.  So one could keep, 
e.g., just the 100 top-scoring terms whose score is greater than 0.5, or 
some such.  This way FuzzyQuery would never trigger 
BooleanQuery.TooManyClauses.  What do you think?
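
A rough sketch of the "keep only the top N" idea, again in plain Java rather than Lucene code. The class `TopTerms`, the method `select`, and the `Map<String, Double>` of precomputed similarity scores are hypothetical names chosen for illustration. It uses a bounded min-heap, so at most `maxTerms` entries are held at any time and the weakest kept term is evicted whenever a better one arrives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Lucene.
class TopTerms {

    // Keeps at most maxTerms terms whose score is strictly greater than
    // minSimilarity, returned best-first. Order among equal scores is
    // unspecified.
    static List<String> select(Map<String, Double> scored,
                               double minSimilarity,
                               int maxTerms) {
        List<String> out = new ArrayList<>();
        if (maxTerms <= 0) {
            return out;
        }
        // Min-heap on score: the head is always the weakest term kept so far.
        PriorityQueue<Map.Entry<String, Double>> heap =
            new PriorityQueue<>(Map.Entry.<String, Double>comparingByValue());
        for (Map.Entry<String, Double> e : scored.entrySet()) {
            if (e.getValue() <= minSimilarity) {
                continue; // below the similarity threshold
            }
            heap.add(e);
            if (heap.size() > maxTerms) {
                heap.poll(); // evict the current weakest term
            }
        }
        // The heap yields ascending scores; prepend to get best-first order.
        while (!heap.isEmpty()) {
            out.add(0, heap.poll().getKey());
        }
        return out;
    }
}
```

Because the result never holds more than `maxTerms` terms, a query built from it stays under the clause limit as long as `maxTerms` is chosen below that limit.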

Thanks for all the great code,

Doug
