lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Did you mean?
Date Tue, 30 Aug 2005 15:13:30 GMT
I wonder if it would further help for the spell checked to make use of
something like WordNet (for English only), where low-frequency words
are "double-checked" against WordNet before considered correct.

Otis

--- Tom White <tom.e.white@gmail.com> wrote:

> On 8/29/05, Chris Lu <chris.lu@gmail.com> wrote:
> > 
> > 
> > Two approaches I can think of:
> > * Use a word list(it may not be the word list you want, but it is
> just
> > a compromise).
> > * Analyze your original index, listing out all words inside.
> > 
> > 
> Using a word list suffers from two problems:
> 1. (Coverage) No word list is complete, they need to be maintained as
> new 
> words are coined, and word lists for different languages vary in
> quality.
> 2. (Useless suggestions) There is little point in making a suggestion
> for 
> words that aren't in the original index (as they wouldn't produce any
> hits).
> 
> For these reasons it is better to use the original index as a source
> of 
> words. It is true that the index will likely contain spelling errors,
> 
> however Lucene Spell Checker provides a way to restrict suggestions
> to words 
> that are more popular than the query term. As misspellings are
> typically 
> rarer than correct spellings this should ensure that misspelled
> suggestions 
> are almost never made. The article quoted above (which I wrote),
> provides a 
> bit more discussion.
> 
> Tom
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message