lucene-java-user mailing list archives

From "Brian Mila" <bm...@iastate.edu>
Subject Re: misspelled queries
Date Thu, 26 Jun 2003 22:15:01 GMT
> GSpell is an open source java spell checking API.  It can be found at
> http://umlslex.nlm.nih.gov/nlsRepository/gspell/doc/userDoc/
>
> It incorporates both metaphone (which is similar to SoundEx, I think) and
> ngram algorithms and it is easy to use.
>

That might be an option, but I'm using NLucene and C# so porting a full java
app is more solution than I'm looking for.


> I currently have an application in which a user submits a query to Lucene
> and along the way I use GSpell to check all the terms in the query.  If any
> are misspelled I underline with a squiggly red line and provide spelling
> suggestions from GSpell if the user right-clicks.
>
> If your spelling correction dictionary is exactly equal to the terms in your
> index then any misspelled word is also guaranteed not to yield any hits, and
> any indexed term is guaranteed not to turn up incorrectly spelled.
>
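
To illustrate the quoted point about using the index's own terms as the
dictionary, here is a minimal sketch. It is not GSpell and not code from this
thread: the function name `suggest` and the 0.75 cutoff are assumptions, and
Python's standard `difflib` similarity ratio stands in for GSpell's metaphone
and ngram matching.

```python
import difflib


def suggest(term, index_terms, max_suggestions=3):
    """Return spelling suggestions drawn only from terms present in the index.

    `index_terms` is assumed to be a set of every term in the index.
    A term that is already indexed is treated as correctly spelled.
    """
    term = term.lower()
    if term in index_terms:
        return []  # indexed terms are never flagged as misspelled
    # Rank indexed terms by similarity; the cutoff is an arbitrary choice.
    return difflib.get_close_matches(
        term, sorted(index_terms), n=max_suggestions, cutoff=0.75
    )


if __name__ == "__main__":
    vocabulary = {"spelling", "lucene", "index"}
    print(suggest("speling", vocabulary))  # ['spelling']
    print(suggest("lucene", vocabulary))   # []
```

Because every suggestion comes from the index vocabulary, a suggested term
always yields at least one hit when searched.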

That's not quite what I wanted, actually.  I don't intend to use a dictionary
at all.  My hope is that the misspelling would be close enough to the correct
spelling that the soundex code would be the same (i.e., spelling, speling and
spellling would all have the same soundex code).
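
The claim about the three spellings can be checked with a short sketch of the
standard American Soundex rules. This is not the CodeProject C# code linked
in the thread; the names `CODES` and `soundex` are illustrative.

```python
# Digit assigned to each consonant group under American Soundex.
CODES = {}
for digit, letters in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word):
    """Return the 4-character Soundex code of `word`, or "" if it has no letters."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]  # pad with zeros, then truncate to 4


if __name__ == "__main__":
    for w in ("spelling", "speling", "spellling"):
        print(w, soundex(w))  # each prints S145
```

One caveat: Soundex always keeps the first letter, so a misspelling in the
first letter (for example "kwery" for "query") produces a different code and
would not match.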

> Jon

> >
> > Hi,
> >
> > I've been thinking about trying to implement a misspelled-query or
> > similarity match, a la Google's "did you mean this ...".  I
> > was thinking of using SoundEx or one of the newer algorithms
> > to find appropriate suggestions.  To do this, though, I think I
> > would need to enumerate every term in the index, which is
> > not a practical solution, I suppose.  Has anyone else
> > attempted this or had any success with this idea?
> >
> > My only other idea would be to generate the SoundEx codes
> > for every term as it's indexed and then add those codes to the
> > index in a different field. (FYI, here's a link that explains
> > SoundEx with example code:
> > http://www.codeproject.com/csharp/soundex.asp?target=soundex).
>
> Then the query would search the regular fields and then form a second
> soundex'd query and run it on the soundex field.  Does this sound plausible?
> I'd be really interested to hear results if anyone has tried this before.
>
> Regards,
> Brian
>
>
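
The two-field idea in the quoted message can be sketched with a toy in-memory
index. This is only an illustration of the approach, not Lucene or NLucene
code: `TinyIndex`, its two dictionaries (standing in for the regular field and
the soundex field), and the whitespace tokenizer are all assumptions.

```python
from collections import defaultdict

# Consonant groups for American Soundex, mapped to their digits.
_CODES = {ch: str(d)
          for d, group in enumerate(("bfpv", "cgjkqsxz", "dt", "l", "mn", "r"), 1)
          for ch in group}


def soundex(word):
    """Compact American Soundex: first letter plus three digits."""
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    out, prev = letters[0].upper(), _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of equal codes
        code = _CODES.get(ch, "")
        if code and code != prev:
            out += code
        prev = code
    return (out + "000")[:4]


class TinyIndex:
    """Toy index with a regular term field and a parallel soundex field."""

    def __init__(self):
        self.term_field = defaultdict(set)     # term -> document ids
        self.soundex_field = defaultdict(set)  # soundex code -> document ids

    def add(self, doc_id, text):
        # The soundex code is computed once per term, at indexing time.
        for term in text.lower().split():
            self.term_field[term].add(doc_id)
            self.soundex_field[soundex(term)].add(doc_id)

    def search(self, word):
        # Run the plain query, then the soundex'd query on the second field.
        exact = self.term_field.get(word.lower(), set())
        phonetic = self.soundex_field.get(soundex(word), set())
        return {"exact": sorted(exact), "phonetic": sorted(phonetic)}


if __name__ == "__main__":
    idx = TinyIndex()
    idx.add(1, "spelling correction")
    idx.add(2, "lucene index")
    print(idx.search("speling"))  # {'exact': [], 'phonetic': [1]}
```

Since the codes are stored as the terms are indexed, the query side only has
to encode the user's word; no enumeration of every term in the index is needed
at search time.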



