lucene-dev mailing list archives

From "James Dyer (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2993) Integrate WordBreakSpellChecker with Solr
Date Tue, 10 Jan 2012 15:47:40 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13183339#comment-13183339 ]

James Dyer commented on SOLR-2993:
----------------------------------

{quote}
So should it be possible to get the suggestion "spellcheck" from "spell check", or not?  Note:
I do get suggestions for terms that are in the index.
{quote}

When combining words, it requires that _at least one_ of the original terms not be in
the index.

So, to use your example, WordBreakSpellChecker will combine "spell check" into "spellcheck"
provided that:
1. "spellcheck" is in the index.
2. either:
 - "spell" is NOT in the index.
   -OR-
 - "check" is NOT in the index.
   -OR-
 - both "spell" and "check" are NOT in the index.
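
The conditions above can be sketched in a few lines. This is only an illustration of the rule as described in this comment, not the WordBreakSpellChecker code: the function name `can_combine` and the `doc_freq` dictionary (term to document frequency, standing in for the index) are hypothetical.

```python
def can_combine(doc_freq, left, right):
    """Sketch of the default combine rule described above.

    doc_freq is a hypothetical stand-in for the index: it maps a term
    to its document frequency, and a missing term counts as 0.
    """
    combined = left + right
    # Condition 1: the combined term must be in the index.
    combined_indexed = doc_freq.get(combined, 0) > 0
    # Condition 2: at least one of the original terms must NOT be in the index.
    both_originals_indexed = (
        doc_freq.get(left, 0) > 0 and doc_freq.get(right, 0) > 0
    )
    return combined_indexed and not both_originals_indexed


if __name__ == "__main__":
    # "spell" is missing from this toy index, so the combination is allowed.
    print(can_combine({"spellcheck": 5, "check": 10}, "spell", "check"))
    # Both original terms are indexed, so no "spellcheck" suggestion by default.
    print(can_combine({"spellcheck": 5, "spell": 3, "check": 10}, "spell", "check"))
```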

But if both "spell" and "check" are in the index, then you won't get "spellcheck" as a suggestion.
You can override this behavior if:
1. You specify "onlyMorePopular".  This works if "spellcheck" has a document frequency that
is greater than or equal to the higher of the document frequencies of "spell" and "check".
2. You apply SOLR-2585 (theoretically...not possible yet) and set "spellcheck.alternativeTermCount"
greater than zero.  This would tell it to generate alternative term suggestions for indexed
terms.
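
As a rough sketch of how the two overrides interact with the default rule: the function name, its keyword arguments and the `doc_freq` dictionary below are hypothetical, and the code makes two assumptions not confirmed here. First, "onlyMorePopular" is treated purely as an override for the case where both original terms are indexed. Second, "spellcheck.alternativeTermCount" (SOLR-2585, not yet applied) is assumed simply to allow suggestions for indexed terms whenever it is greater than zero.

```python
def suggests_combination(doc_freq, left, right,
                         only_more_popular=False,
                         alternative_term_count=0):
    """Sketch of the combine rule plus the two overrides described above.

    doc_freq is a hypothetical term -> document frequency map standing in
    for the index; a missing term counts as 0.
    """
    combined_freq = doc_freq.get(left + right, 0)
    if combined_freq == 0:
        # The combined term must always be in the index.
        return False

    left_freq = doc_freq.get(left, 0)
    right_freq = doc_freq.get(right, 0)
    if left_freq == 0 or right_freq == 0:
        # Default rule: at least one original term is not indexed.
        return True

    # Both original terms are indexed, so only an override can help.
    if only_more_popular and combined_freq >= max(left_freq, right_freq):
        return True
    # Assumed behavior of SOLR-2585: any value above zero enables
    # alternative suggestions for indexed terms.
    return alternative_term_count > 0


if __name__ == "__main__":
    index = {"spellcheck": 12, "spell": 3, "check": 10}
    print(suggests_combination(index, "spell", "check"))
    print(suggests_combination(index, "spell", "check", only_more_popular=True))
```

With the toy frequencies in the usage lines, the first call yields no suggestion because both terms are indexed, while the second does because 12 is at least as large as the higher original frequency, 10.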

If this is not consistent with what you're experiencing then there is a possible bug in the
WordBreakSpellChecker.  In that case, please provide as many details as possible (or write
a failing unit test) and I can look into it further.
                
> Integrate WordBreakSpellChecker with Solr
> -----------------------------------------
>
>                 Key: SOLR-2993
>                 URL: https://issues.apache.org/jira/browse/SOLR-2993
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud, spellchecker
>    Affects Versions: 4.0
>            Reporter: James Dyer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2993.patch
>
>
> A SpellCheckComponent enhancement, leveraging the WordBreakSpellChecker from LUCENE-3523:
> - Detect spelling errors resulting from misplaced whitespace without the use of shingle-based dictionaries.
> - Seamlessly integrate word-break suggestions with single-word spelling corrections from the existing FileBased-, IndexBased- or Direct- spell checkers.
> - Provide collation support for word-break errors including cases where the user has a mix of single-word spelling errors and word-break errors in the same query.
> - Provide shard support.


