lucene-dev mailing list archives

From "Okke Klein (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SOLR-2993) Integrate WordBreakSpellChecker with Solr
Date Sat, 07 Jan 2012 16:11:39 GMT

    [ https://issues.apache.org/jira/browse/SOLR-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13182013#comment-13182013 ]

Okke Klein commented on SOLR-2993:
----------------------------------

If I am not mistaken, the functionality from https://issues.apache.org/jira/browse/SOLR-2585
can also be achieved in DirectSolrSpellChecker with the thresholdTokenFrequency parameter. So
I applied this patch and the corresponding Lucene patch to trunk and did some experimenting.

Misplaced whitespace was fixed and proper suggestions were returned. However, if both
word parts resulted in suggestions, the collation made no sense.

Hypothetical example:
"spe llcheck" would give the suggestions "spa" and "spellcheck" and collate them into "spa spellcheck".

In my use case, I never got any results back when one of the parts had a typo. For example,
"spe llchek" would not give any suggestions.
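
The hypothetical "spe llcheck" collation above can be reproduced with a small sketch. This is not Solr's collation code; the `Suggestion` type and both functions are made up for illustration, and the assumption that the combined-word suggestion gets anchored at the last token is only one way such output could arise. The second function shows one possible fix: prefer the widest suggestion and drop any that overlap it.

```python
from typing import List, NamedTuple


class Suggestion(NamedTuple):
    """Hypothetical suggestion covering query tokens [start, end)."""
    start: int
    end: int
    text: str


def naive_collate(tokens: List[str], suggestions: List[Suggestion]) -> str:
    # Assumption: every suggestion is treated as a one-token replacement
    # anchored at its last covered token, so overlapping suggestions
    # are all applied.
    out = list(tokens)
    for s in suggestions:
        out[s.end - 1] = s.text
    return " ".join(out)


def overlap_aware_collate(tokens: List[str], suggestions: List[Suggestion]) -> str:
    # Pick suggestions widest-first and skip any that overlap one
    # that has already been chosen.
    chosen: List[Suggestion] = []
    for s in sorted(suggestions, key=lambda s: s.start - s.end):
        if all(s.end <= c.start or s.start >= c.end for c in chosen):
            chosen.append(s)

    by_start = {c.start: c for c in chosen}
    out: List[str] = []
    i = 0
    while i < len(tokens):
        if i in by_start:
            out.append(by_start[i].text)
            i = by_start[i].end  # jump past every token the suggestion covers
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)


tokens = ["spe", "llcheck"]
suggestions = [
    Suggestion(0, 1, "spa"),         # single-word correction for "spe"
    Suggestion(0, 2, "spellcheck"),  # word-combine correction for both tokens
]

print(naive_collate(tokens, suggestions))          # spa spellcheck
print(overlap_aware_collate(tokens, suggestions))  # spellcheck
```

Whether the widest span should always win is a design choice; a real implementation would more likely rank candidate collations by hit count or frequency.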

For my use case, it would also be handy if "spell check" resulted in the suggestion "spellcheck".

Or is this already possible?
                
> Integrate WordBreakSpellChecker with Solr
> -----------------------------------------
>
>                 Key: SOLR-2993
>                 URL: https://issues.apache.org/jira/browse/SOLR-2993
>             Project: Solr
>          Issue Type: Improvement
>          Components: SolrCloud, spellchecker
>    Affects Versions: 4.0
>            Reporter: James Dyer
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-2993.patch
>
>
> A SpellCheckComponent enhancement, leveraging the WordBreakSpellChecker from LUCENE-3523:
> - Detect spelling errors resulting from misplaced whitespace without the use of shingle-based dictionaries.
> - Seamlessly integrate word-break suggestions with single-word spelling corrections from the existing FileBased-, IndexBased- or Direct- spell checkers.
> - Provide collation support for word-break errors including cases where the user has a mix of single-word spelling errors and word-break errors in the same query.
> - Provide shard support.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
