lucene-dev mailing list archives

From "Robert Muir (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3888) split off the spell check word and surface form in spell check dictionary
Date Sat, 24 Mar 2012 17:44:26 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237614#comment-13237614 ]

Robert Muir commented on LUCENE-3888:
-------------------------------------

Let me see if I can help with the test. I feel bad that I didn't supply one with the prototype patch.

About the Solr integration: this looks good! We can use a similar approach for autosuggest,
too, so this could configure the analyzer for LUCENE-3842.

I wonder if we should allow separate configuration of "index" and "query" analyzers? I know
I came up with some use-cases for that for autosuggest, but I'm not sure about spellchecking.
I guess it wouldn't be overkill to allow it though.
                
> split off the spell check word and surface form in spell check dictionary
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-3888
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3888
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: modules/spellchecker
>            Reporter: Koji Sekiguchi
>            Assignee: Koji Sekiguchi
>            Priority: Minor
>             Fix For: 3.6, 4.0
>
>         Attachments: LUCENE-3888.patch, LUCENE-3888.patch, LUCENE-3888.patch, LUCENE-3888.patch
>
>
> Unfortunately, the "did you mean?" feature built on Lucene's spell checker does not work
> well for Japanese, and this is a longstanding problem. The logic needs comparatively long
> text to check spelling, but in some languages (e.g. Japanese) most words are too short
> for the spell checker to use.
> I think that, at least for Japanese, things can be improved if we split off the spell
> check word and the surface form in the spell check dictionary. Then we can use, for example,
> ReadingAttribute for spell checking but CharTermAttribute for suggesting.
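
To make the idea in the description concrete, here is a minimal sketch in plain Java with no Lucene dependency. It is not the code from the attached patches. The class name `ReadingSpellSketch`, the romanized readings, and the use of a plain Levenshtein threshold are all assumptions made for illustration; Lucene's own spell checker uses its own index and distance measures. The sketch only shows the split itself: matching happens against the (longer) check word, and what gets returned is the surface form.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: a dictionary keyed by the "spell check word"
// (here a romanized reading) that returns the stored surface form as the suggestion.
public class ReadingSpellSketch {
    // check word (reading) -> surface forms that share that reading
    private final Map<String, List<String>> dictionary = new LinkedHashMap<>();

    public void add(String reading, String surface) {
        dictionary.computeIfAbsent(reading, k -> new ArrayList<>()).add(surface);
    }

    // Classic two-row Levenshtein distance; a stand-in for whatever
    // distance measure a real spell checker would be configured with.
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Compare the query against the check words, but suggest the surface forms.
    public List<String> suggest(String queryReading, int maxEdits) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : dictionary.entrySet()) {
            if (editDistance(queryReading, entry.getKey()) <= maxEdits) {
                result.addAll(entry.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ReadingSpellSketch dict = new ReadingSpellSketch();
        dict.add("toukyou", "\u6771\u4EAC"); // "Tokyo" written in kanji (2 characters)
        dict.add("kyouto", "\u4EAC\u90FD");  // "Kyoto" written in kanji (2 characters)
        // The misspelled reading is one edit away from "toukyou".
        System.out.println(dict.suggest("toukyo", 1));
    }
}
```

The two-character surface forms give an edit-distance check very little to work with, while the readings are several characters long, which is the motivation stated in the description.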

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

