lucene-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From eks dev <eks...@yahoo.co.uk>
Subject Re: Edit-distance strategy
Date Thu, 08 Jun 2006 14:12:04 GMT
what could show measurably faster results is more in Jaro distance direction, but even than,
the fact that you need to scan all terms in dictionary will make it prohibitive for large
colletions. For small collections this can be packed in some smart data structures to restrict
number of distance calculations...
good luck with it

----- Original Message ----
From: eks dev <eksdev@yahoo.co.uk>
To: java-dev@lucene.apache.org
Sent: Thursday, 8 June, 2006 4:07:07 PM
Subject: Re: Edit-distance strategy


>I'm about to replace the edit-distance algorithm in FuzzyQuery from
>Levenstein to Hirschberg to save a couple of clockticks.

Have you allready confirmed Hirschberg algorithm to be faster than current implementation
of edit distance? I am not convinced it helps really. Hirschberg and standard DP algorithm
both have O(|s1|*|s2|) time. The only saving is that Hirschberg needs O(|s2|) space using
binary recursion (classic needs O(|s1|*|s2|) space). 




---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org





---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Mime
View raw message