lucene-java-user mailing list archives

From "Robert Muir" <rcm...@gmail.com>
Subject Re: FastSSFuzzy for faster fuzzy queries in Lucene
Date Tue, 06 Jan 2009 16:32:11 GMT
hi,
yes, i verify the results that come back from the lucene index at runtime
before expanding the query. i considered storing the delete positions as
payloads or something, but FastSSWc is good enough for me.

i'll see about posting my code today.
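
For readers following the thread, the verify-before-expanding step described
above can be sketched roughly as follows. This is an illustrative sketch only,
not the code mentioned in this message: the class FastSsSketch and its methods
are hypothetical names, an in-memory map stands in for the secondary lucene
index, and plain Levenshtein distance is assumed for the verification.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a k=1 deletion-neighborhood lookup without stored
// delete positions, followed by edit-distance verification of candidates.
final class FastSsSketch {
    // Maps each deletion variant to the terms that produce it.
    // Assumption: a plain map replaces the secondary index from the thread.
    private final Map<String, Set<String>> index = new HashMap<>();

    // The word itself plus every string obtained by deleting one character.
    static Set<String> deletionNeighborhood(String word) {
        Set<String> out = new LinkedHashSet<>();
        out.add(word);
        for (int i = 0; i < word.length(); i++) {
            out.add(word.substring(0, i) + word.substring(i + 1));
        }
        return out;
    }

    void add(String term) {
        for (String variant : deletionNeighborhood(term)) {
            index.computeIfAbsent(variant, v -> new TreeSet<>()).add(term);
        }
    }

    // Terms sharing at least one variant with the query. Because delete
    // positions are not stored, some of these can be 2 edits away.
    Set<String> candidates(String query) {
        Set<String> out = new TreeSet<>();
        for (String variant : deletionNeighborhood(query)) {
            out.addAll(index.getOrDefault(variant, Collections.<String>emptySet()));
        }
        return out;
    }

    // Standard dynamic-programming Levenshtein distance.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1),
                                  prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    // Keep only candidates whose true distance is within k; these are the
    // terms that would be used to expand the query.
    List<String> search(String query, int k) {
        List<String> verified = new ArrayList<>();
        for (String term : candidates(query)) {
            if (editDistance(query, term) <= k) {
                verified.add(term);
            }
        }
        return verified;
    }
}
```

With "from" indexed, a lookup for "form" returns "from" as a candidate because
both share the variants "frm" and "fom", yet their Levenshtein distance is 2,
so the verification step drops it when k=1.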

On Tue, Jan 6, 2009 at 4:52 AM, Thomas Bocek <bocek@ifi.uzh.ch> wrote:

> Hi Robert,
>
> Robert Muir wrote:
> > hi, I'm actually working on doing just this (though I haven't created a
> > jira ticket).
> >
> > the way i have it working is by creating a secondary lucene index. the
> > size of this secondary index is determined primarily by number of unique
> > tokens and length of the tokens. if you read the fastss paper, the
> > specific algorithm I am using is FastSSWc with k=1.
>
> FastSSWc does not require storing the positions of deleted characters of
> neighbors, which saves space. However, some candidates that are found when
> searching with k=1 may actually be at distance k=2.
>
> > my specific implementation has a bunch of other algorithms other than
> > fastss involved so i just need to separate this stuff out and clean
> > everything up a bit. as soon as i can get it in any reasonable shape
> > i'll attach it to a jira ticket... maybe you want to help?
>
> Can you point me to the code?
>
> Regards,
>
> Thomas
>
>


-- 
Robert Muir
rcmuir@gmail.com
