lucene-dev mailing list archives

From Christoph Goller <gol...@detego-software.de>
Subject Re: The fuzziness of FuzzyQuery
Date Fri, 13 Aug 2004 10:15:33 GMT
Daniel Naber wrote:
> Hi,
> 
> I think FuzzyQuery is not as useful as it could be, because it's too fuzzy. 
> For a word with 10 characters it allows an edit distance of 4, i.e. almost 
> half of the word can be different. I suggest adding an option so the 
> fuzziness can be configured, as in the attached patch. If nobody objects, 
> I will commit it (plus test cases). I'll later also try to modify 
> QueryParser to support this, but I cannot promise to get that working.
> 
> One thing I don't quite understand is the meaning of scale_factor. Does it 
> make sense to configure that from outside, too?

+1 for these changes.
I don't think it makes sense to make scale_factor configurable from outside.
It has to be computed from minimumSimilarity (the configurable counterpart
of FUZZY_THRESHOLD) so that the difference value for an exact match remains
1.0 (it is used as a boost later).

Christoph
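
The relationship described above can be sketched as follows. This is only an illustration, not the actual FuzzyTermEnum code: the class and method names are hypothetical, and it assumes that similarity is 1 - editDistance / termLength, that a term matches when its similarity is strictly greater than the minimum, and that the difference is (similarity - minimumSimilarity) * scaleFactor.

```java
// Illustrative sketch only; names are hypothetical, not Lucene's API.
final class FuzzyScaleSketch {

    // Assumed derivation: the scale factor depends only on the minimum
    // similarity, so it is computed rather than configured separately.
    static float scaleFactor(float minimumSimilarity) {
        if (minimumSimilarity < 0.0f || minimumSimilarity >= 1.0f) {
            throw new IllegalArgumentException("minimumSimilarity must be in [0, 1)");
        }
        return 1.0f / (1.0f - minimumSimilarity);
    }

    // Assumed similarity measure: 1.0 for an exact match, lower as the
    // edit distance grows relative to the term length.
    static float similarity(int editDistance, int termLength) {
        return 1.0f - ((float) editDistance / termLength);
    }

    // Rescales similarities from (minimumSimilarity, 1.0] onto (0.0, 1.0],
    // so an exact match maps to 1.0 whatever the minimum is.
    static float difference(float similarity, float minimumSimilarity) {
        return (similarity - minimumSimilarity) * scaleFactor(minimumSimilarity);
    }

    public static void main(String[] args) {
        for (float min : new float[] {0.5f, 0.7f, 0.9f}) {
            System.out.println("minimumSimilarity=" + min
                    + " exactMatchDifference=" + difference(1.0f, min));
        }
    }
}
```

Under these assumptions, a 10-character term at edit distance 4 has similarity 0.6 and still passes a minimum of 0.5, which matches the behaviour Daniel describes. If scale_factor were set independently of minimumSimilarity, an exact match would no longer map to 1.0.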



