lucene-java-user mailing list archives

From Helmut Jarausch <jarau...@igpm.rwth-aachen.de>
Subject FuzzyQuery - rounding bug?
Date Mon, 17 Dec 2007 08:41:58 GMT
Hi,

According to the LiA book (Lucene in Action), the FuzzyQuery similarity is computed as

1 - distance / min(textlen, targetlen)

Given
def addDoc(text, writer):
    doc = Document()
    doc.add(Field("field", text,
                  Field.Store.YES, Field.Index.TOKENIZED))
    writer.addDocument(doc)
    
addDoc("aaaaa", writer)
addDoc("aaaab", writer)
addDoc("aaabb", writer)
addDoc("aabbb", writer)
addDoc("abbbb", writer)
addDoc("bbbbb", writer)
addDoc("ddddd", writer)

# arguments: term, minimumSimilarity, prefixLength
query = FuzzyQuery(Term("field", "aaaaa"), 0.8, 0)

should find "aaaab", since we have
distance = 1
min(textlen, targetlen) = 5
and therefore a similarity of 1 - 1/5 = 0.8.

It does find it with
query = FuzzyQuery(Term("field", "aaaaa"), 0.79, 0)
though.

Is this a rounding-error bug?

(This is with lucene-java-2.2.0-603782.)
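
For reference, the formula quoted from the book can be checked by hand with a few lines of plain Python. This is only a sketch of the formula as stated in this message, not Lucene's actual implementation. The helper names `levenshtein` and `similarity` are made up for illustration, and the code assumes Python 3 division and non-empty strings. Whether Lucene compares the score against the threshold with `>` or `>=` is not established here, so the loop prints both.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # substitution
        prev = cur
    return prev[-1]


def similarity(text, target):
    """1 - distance / min(textlen, targetlen), as quoted from the book.

    Assumes both strings are non-empty (otherwise this divides by zero).
    """
    return 1.0 - levenshtein(text, target) / min(len(text), len(target))


# Same terms as in the index above.
terms = ["aaaaa", "aaaab", "aaabb", "aabbb", "abbbb", "bbbbb", "ddddd"]
threshold = 0.8

for term in terms:
    score = similarity("aaaaa", term)
    # Show the outcome for both an exclusive and an inclusive threshold.
    print(term, score, score > threshold, score >= threshold)
```

For "aaaab" the formula gives exactly the threshold value in exact arithmetic, so whether the term matches depends on how the comparison is written and on how the floating-point values round.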

Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany
