lucene-java-user mailing list archives

From Grant Ingersoll <>
Subject Re: Spell check of a large text
Date Thu, 11 Dec 2008 13:48:06 GMT
I think I'm missing something here...

Spell checked in what sense?  Sounds to me like you need
dictionary-based spell checking at index time, not index-based spell
checking at search time, right?

How about hooking up something like the Jazzy spell checker into a
TokenFilter?  Then, as the tokens stream by, you look up the spelling
of each one and add a 1-byte payload to every word that is misspelled.
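
To make that concrete, here is a rough, library-free sketch of the idea, not actual Lucene or Jazzy code. It assumes a plain HashSet of lowercase words stands in for the Jazzy dictionary and a simple regex stands in for the tokenizer; SpellFlagSketch, FlaggedToken, tag() and hasMisspelling() are made-up names. In Lucene the same lookup would sit inside the TokenFilter, with the flag attached as the token's payload.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a payload-tagging TokenFilter; no Lucene classes are used. */
class SpellFlagSketch {
    static final byte OK = 0;
    static final byte MISSPELLED = 1;

    /** One token with its character offsets and a 1-byte flag, mimicking a term plus payload. */
    static final class FlaggedToken {
        final String term;
        final int start;
        final int end;
        final byte flag;

        FlaggedToken(String term, int start, int end, byte flag) {
            this.term = term;
            this.start = start;
            this.end = end;
            this.flag = flag;
        }
    }

    // Assumption: a "word" is a run of letters and apostrophes.
    private static final Pattern WORD = Pattern.compile("[\\p{L}']+");

    /** Tags every token; the dictionary is assumed to hold lowercase words. */
    static List<FlaggedToken> tag(String text, Set<String> dictionary) {
        List<FlaggedToken> out = new ArrayList<>();
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            boolean known = dictionary.contains(m.group().toLowerCase(Locale.ROOT));
            out.add(new FlaggedToken(m.group(), m.start(), m.end(), known ? OK : MISSPELLED));
        }
        return out;
    }

    /** Yes/no check that stops at the first unknown word. */
    static boolean hasMisspelling(String text, Set<String> dictionary) {
        Matcher m = WORD.matcher(text);
        while (m.find()) {
            if (!dictionary.contains(m.group().toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }
}
```

If all you need is the yes/no answer, something like hasMisspelling() avoids building the token list at all, which matters when documents arrive at a high rate.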

As for Highlighter, hmmm...  Not sure if there is a way to make a
Fragmenter/Scorer that is payload-aware, such that it would only
produce fragments (and scores) for the sections of the file that carry
these payloads.  This is definitely pushing my area of expertise, but
maybe one of the Highlighter experts can chime in.
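
In the meantime, one way to sidestep Highlighter entirely (a different technique from the payload-aware Fragmenter/Scorer above, and only a sketch) is to mark the misspelled words directly from their character offsets. The code below assumes a HashSet of lowercase words as the dictionary and a regex as the tokenizer; MisspellingMarkerSketch and highlight() are made-up names, not Lucene APIs.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical offset-based marker; it does not use Lucene's Highlighter. */
class MisspellingMarkerSketch {
    // Assumption: a "word" is a run of letters and apostrophes.
    private static final Pattern WORD = Pattern.compile("[\\p{L}']+");

    /** Wraps every word missing from the (lowercase) dictionary in pre/post markers. */
    static String highlight(String text, Set<String> dictionary, String pre, String post) {
        StringBuilder out = new StringBuilder();
        Matcher m = WORD.matcher(text);
        int last = 0; // end of the text already copied to the output
        while (m.find()) {
            if (!dictionary.contains(m.group().toLowerCase(Locale.ROOT))) {
                // Copy the untouched text before the word, then the marked word.
                out.append(text, last, m.start()).append(pre).append(m.group()).append(post);
                last = m.end();
            }
        }
        out.append(text, last, text.length()); // copy whatever follows the last marked word
        return out.toString();
    }
}
```

With a dictionary containing "the", "quick" and "fox", highlight("The quikc fox", dict, "<b>", "</b>") yields "The <b>quikc</b> fox".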


On Dec 11, 2008, at 6:18 AM, Lucene User no 1981 wrote:

> Hi,
> the problem is as follows: there is a text, ca. 30 KB, that has to be
> spellchecked automatically; there is no manual intervention and no
> suggestions are needed. All I would like to achieve is a simple check of
> whether there are any problems with the spelling or not. It has to be
> rather fast because there are tons of docs a minute going through the
> system. Solutions like SpellChecker.exists() don't really apply.
> Additionally, spelling errors could be highlighted; I haven't really
> found any reasonable way of leveraging Highlighter for that task.
> Does anyone have any idea how this problem can be addressed with Lucene?
> Regards,
> Mac

Grant Ingersoll