lucene-dev mailing list archives

From "Okke Klein (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2993) Integrate WordBreakSpellChecker with Solr
Date Mon, 09 Jan 2012 20:27:40 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182767#comment-13182767 ]

Okke Klein commented on SOLR-2993:
----------------------------------

{quote}
This is a problem with collations in general: By default, it simply mashes the top corrections
together, often resulting in nonsense. The solution is to set "spellcheck.maxCollationTries"
to a non-zero value. Doing so will cause the spellchecker to vet the collation possibilities
against the index, resulting in collations that are guaranteed to generate hits.
{quote}
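
For reference, a minimal sketch of a request that turns on the setting mentioned in the quote. The parameter names (spellcheck, spellcheck.collate, spellcheck.maxCollationTries) are the standard SpellCheckComponent request parameters; the host, the handler path and the value 5 are assumptions for illustration, and the handler is assumed to have the spellcheck component configured.

```python
from urllib.parse import urlencode


def build_spellcheck_query(base_url, user_query, max_collation_tries=5):
    """Build a Solr request URL that asks for index-verified collations."""
    params = {
        "q": user_query,
        "spellcheck": "true",
        "spellcheck.collate": "true",
        # A non-zero value makes Solr test collation candidates against the index.
        "spellcheck.maxCollationTries": str(max_collation_tries),
    }
    return base_url + "?" + urlencode(params)


# Hypothetical host and handler path; adjust to the local setup.
url = build_spellcheck_query("http://localhost:8983/solr/select", "spell check")
print(url)
```

The function only builds the URL; sending it and reading the collations from the response is left out.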

If the word-break checker returns a combined word as a suggestion, a suggestion containing
a word fragment with more hits is still ranked higher in the collation.

So "spa llcheck" is preferred over "spellcheck" if "spa" has more hits than "spellcheck".
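
A toy model of the ordering described above, as I understand it. This is not Solr's ranking code: the hit counts are invented, and both scoring functions are hypothetical and only there to show the contrast.

```python
# Invented hit counts, for illustration only.
hits = {"spa": 120, "llcheck": 3, "spellcheck": 40}

# Two candidate corrections, each a list of terms.
candidates = [["spa", "llcheck"], ["spellcheck"]]


def score_by_best_term(candidate):
    """Score a candidate by its most frequent term (the behaviour observed)."""
    return max(hits[term] for term in candidate)


def score_by_weakest_term(candidate):
    """Score a candidate by its least frequent term (a hypothetical alternative)."""
    return min(hits[term] for term in candidate)


observed = max(candidates, key=score_by_best_term)
alternative = max(candidates, key=score_by_weakest_term)

print(" ".join(observed))     # spa llcheck
print(" ".join(alternative))  # spellcheck
```

With these numbers the fragment "spa" wins under the first scoring even though "llcheck" is rare; scoring by the weakest term would put "spellcheck" first.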

{quote}
it would also be handy if "spell check" would result in the suggestion "spellcheck". Or is
this already possible?

This is the core of what this issue (really LUCENE-3523) is all about, provided that "spellcheck"
is in the dictionary&index you're using.
{quote}

I never got this working: no suggestions were given when both word fragments were spelled
correctly and the combined word was in the index. (When I made a typo in the combined word,
the word was returned as a suggestion.)
                
> Integrate WordBreakSpellChecker with Solr
> -----------------------------------------
>
>                 Key: SOLR-2993
>                 URL: https://issues.apache.org/jira/browse/SOLR-2993
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud, spellchecker
>    Affects Versions: 4.0
>            Reporter: James Dyer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2993.patch
>
>
> A SpellCheckComponent enhancement, leveraging the WordBreakSpellChecker from LUCENE-3523:
> - Detect spelling errors resulting from misplaced whitespace without the use of shingle-based dictionaries.
> - Seamlessly integrate word-break suggestions with single-word spelling corrections from the existing FileBased-, IndexBased- or Direct- spell checkers.
> - Provide collation support for word-break errors including cases where the user has a mix of single-word spelling errors and word-break errors in the same query.
> - Provide shard support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
