lucene-general mailing list archives

From Nalini Kartha <nalinikar...@gmail.com>
Subject Lucene Spellchecker versions
Date Wed, 07 Sep 2011 18:49:17 GMT
Hi,

We want to implement some sort of spell correction for search and we're
looking at the Lucene Spellchecker for this.

We're still stuck on Lucene 2.2, though, so it looks like the old version,
which requires a separate dictionary index to be built, is the only option.
Is that correct?

For the k-gram based spell checker that requires the separate dictionary
index, is there any supported method for keeping the dictionary index in
sync with the original index, i.e. what's the best way to propagate
adds/deletes on the original index to the dictionary index? Do you recommend
just rebuilding the dictionary index at some regular interval?
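
To make the "rebuild at a regular interval" option concrete, here is a toy
sketch in plain Python. It is not the Lucene API: `ngrams`,
`ToyDictionaryIndex`, `rebuild` and `suggest` are made-up names, and ranking
candidates by shared bigram count is only a rough stand-in for what the real
spell checker does. The point is just that a full rebuild picks up both adds
and deletes from the original index without tracking them one by one.

```python
from collections import defaultdict


def ngrams(word, n=2):
    """Return the set of character n-grams of `word`, with boundary markers."""
    padded = "^" + word + "$"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


class ToyDictionaryIndex:
    """Hypothetical stand-in for the separate spelling dictionary index."""

    def __init__(self):
        # Maps each n-gram to the set of dictionary words containing it.
        self.postings = defaultdict(set)

    def rebuild(self, terms):
        """Discard the old dictionary and re-derive it from the given terms."""
        self.postings = defaultdict(set)
        for term in terms:
            for gram in ngrams(term):
                self.postings[gram].add(term)

    def suggest(self, word, limit=3):
        """Rank dictionary words by how many n-grams they share with `word`."""
        counts = defaultdict(int)
        for gram in ngrams(word):
            for candidate in self.postings.get(gram, ()):
                counts[candidate] += 1
        # Most shared n-grams first; ties broken alphabetically.
        ranked = sorted(counts, key=lambda w: (-counts[w], w))
        return ranked[:limit]


# Terms currently in the original index (assumed example data).
spell = ToyDictionaryIndex()
spell.rebuild({"lucene", "search", "index"})
first = spell.suggest("lucen")

# Later, "lucene" has been deleted from the original index; a scheduled
# rebuild drops it from the dictionary as well.
spell.rebuild({"search", "index"})
second = spell.suggest("lucen")
```

The trade-off in this sketch is staleness: between rebuilds, deleted terms can
still be suggested and newly added terms cannot.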

I'm also trying to understand how the new (FST-based) spellchecker works. My
understanding so far is that we build an FST from the term we're trying to
find corrections for, but I'm fuzzy on how that FST is then intersected with
the term dictionary. Is there any detailed documentation that explains this?

Also, are there changes to the term dictionary structure (post 2.2) that are
required to support FST based spell correction? If so, what exactly are
those changes? A presentation I saw alluded to Fast Numeric Range queries
introduced in Lucene 2.9 but I didn't quite understand how the two are
related.

Thanks in advance,
Nalini
