lucene-dev mailing list archives

From <karl.wri...@nokia.com>
Subject RE: LevenshteinFilter proposal
Date Fri, 23 Jul 2010 15:09:59 GMT
Glad I asked.

I would think that the automaton would be superior to the equivalent “crappy” algorithm
even for edit distances larger than 1 or 2.  But maybe I don’t understand something. ;-)

Karl


From: ext Robert Muir [mailto:rcmuir@gmail.com]
Sent: Friday, July 23, 2010 11:05 AM
To: dev@lucene.apache.org
Subject: Re: LevenshteinFilter proposal

This is actually done in trunk.

In trunk, FuzzyQuery's enum is a "proxy": for low distances (ed=1,2) it uses an automaton.

For higher distances it uses the crappy "brute force" method.
Higher distances still get accelerated, though, if you pass a reasonable 'maxExpansions' to FuzzyQuery;
the default (1024) is quite bad.
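
As a rough illustration of the "brute force" approach and of why a small 'maxExpansions' helps, here is a minimal sketch in plain Java. It is not Lucene's code: the class BruteForceFuzzy, the methods editDistance and expand, and the sample terms are all hypothetical, and the cap is modelled (as an assumption) simply as a bounded queue that keeps only the closest terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch, not Lucene's implementation.
public class BruteForceFuzzy {

    /** A term together with its edit distance to the query. */
    static final class Candidate {
        final String term;
        final int distance;

        Candidate(String term, int distance) {
            this.term = term;
            this.distance = distance;
        }
    }

    /** Classic two-row dynamic-programming Levenshtein distance. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Brute force: visit every term, compute its distance to the query,
     * and keep at most maxExpansions of the closest terms within maxEdits.
     */
    static List<String> expand(List<String> terms, String query,
                               int maxEdits, int maxExpansions) {
        List<String> result = new ArrayList<>();
        if (maxExpansions <= 0) {
            return result;
        }
        // Max-heap on distance: the head is the worst candidate kept so far.
        PriorityQueue<Candidate> worstFirst =
            new PriorityQueue<>((x, y) -> Integer.compare(y.distance, x.distance));
        for (String term : terms) {
            int d = editDistance(query, term);
            if (d > maxEdits) {
                continue; // too far away to match at all
            }
            if (worstFirst.size() < maxExpansions) {
                worstFirst.add(new Candidate(term, d));
            } else if (d < worstFirst.peek().distance) {
                worstFirst.poll(); // evict the current worst
                worstFirst.add(new Candidate(term, d));
            }
        }
        List<Candidate> kept = new ArrayList<>(worstFirst);
        kept.sort(Comparator.comparingInt((Candidate c) -> c.distance)
                            .thenComparing(c -> c.term));
        for (Candidate c : kept) {
            result.add(c.term);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> terms =
            Arrays.asList("lucene", "lucent", "lucerne", "solr", "lucid", "luce");
        // Prints [lucene, lucent, lucerne]
        System.out.println(expand(terms, "lucene", 2, 3));
    }
}
```

Every term still costs one distance computation here, which is the part an automaton avoids. The cap only limits how many matching terms survive into the rewritten query.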


On Fri, Jul 23, 2010 at 10:59 AM, <karl.wright@nokia.com> wrote:
Thanks!

FuzzyQuery will do for my purposes, for the interim.  But I suspect that FuzzyQuery could
be made a lot more efficient if it were rebuilt on top of Automaton, no?  I understand that
this would be a trunk project.

Karl


From: ext Uwe Schindler [mailto:uwe@thetaphi.de]
Sent: Friday, July 23, 2010 10:45 AM
To: dev@lucene.apache.org
Subject: RE: LevenshteinFilter proposal

Automaton is only in Lucene/Solr trunk. To get a filter out of FuzzyQuery, use
MultiTermQueryWrapperFilter(new FuzzyQuery(…))

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de

From: karl.wright@nokia.com [mailto:karl.wright@nokia.com]
Sent: Friday, July 23, 2010 4:25 PM
To: dev@lucene.apache.org
Subject: LevenshteinFilter proposal

Hi Folks,

I’m very interested in using (or developing!) a Levenshtein Filter within the family of
Solr Filter objects. I don’t see such a class today anywhere. I see how the AutomatonQuery
object would permit such a thing to be built, but to date I don’t know of anyone who has
built one. Do you?  If not, I’m willing to give it a whirl.  Also, AutomatonQuery doesn’t
seem to come up when I look for it in the javadocs for Lucene – can you point me in the
correct direction?
Thanks!
Karl

--
Robert Muir
rcmuir@gmail.com