lucene-solr-dev mailing list archives

From "Otis Gospodnetic (JIRA)" <j...@apache.org>
Subject [jira] Commented: (SOLR-572) Spell Checker as a Search Component
Date Tue, 27 May 2008 22:28:59 GMT

    [ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12600294#action_12600294 ]

Otis Gospodnetic commented on SOLR-572:
---------------------------------------

Right, Google only shows you the final output, not what they do in the backend.
But the fact that they italicize misspelled words tells us they have a mechanism that allows
the front end to identify them.
So I think our task here is to figure out the best/easiest way for the client to identify
misspelled words and offer the alternative query to the end user.

I think what I outlined above will do that for us:
* output all words sequentially
* mark the words that are misspelled - it may be best to return the original word plus the corrected word:

<word="london"/> <!-- unchanged -->
<word="brigge">bridge</word>

or maybe with offset info:

<word="london" offset="0"/> <!-- unchanged -->
<word="brigge" offset="6">bridge</word>
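
As a rough illustration of the per-word output sketched above, here is a minimal Python sketch. It is not Solr code: the CORRECTIONS dictionary stands in for a real spell index, mark_words is a hypothetical helper, and because <word="london"/> is not well-formed XML as written, the sketch assumes an attribute named "original" to carry the original word. Tokens are split on whitespace only, where a real implementation would use the field's analyzer.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical stand-in for a real spell index: misspelling -> suggestion.
CORRECTIONS = {"brigge": "bridge"}


def mark_words(query, corrections=CORRECTIONS):
    """Return one <word> element per whitespace-separated token, in query order.

    Unchanged words become empty elements; misspelled words carry the
    suggested correction as element text. The "original" attribute name
    is an assumption made for this sketch.
    """
    elements = []
    for match in re.finditer(r"\S+", query):
        token, offset = match.group(), match.start()
        attrs = f'original={quoteattr(token)} offset="{offset}"'
        suggestion = corrections.get(token.lower())
        if suggestion is None:
            elements.append(f"<word {attrs}/>")
        else:
            elements.append(f"<word {attrs}>{escape(suggestion)}</word>")
    return elements


for line in mark_words("london brigge"):
    print(line)
```

With 0-based character offsets, "brigge" starts at position 7 in this sketch, because the space after "london" occupies position 6.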

It's also fine to return, in addition, the final corrected string, with the corrected
words not marked in any way, and let the "lazy" clients just use that.
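
To show how a client might use either form, here is a minimal Python sketch. The Word tuple and apply_corrections are hypothetical names, and the sketch assumes each entry carries the original word, its 0-based character offset in the query, and an optional suggestion. A "lazy" client takes the plain corrected string, while a richer client wraps each corrected word (here in <i>...</i>, purely as an example of marking).

```python
from typing import Callable, List, NamedTuple, Optional


class Word(NamedTuple):
    original: str               # word as typed by the user
    offset: int                 # 0-based character offset in the query
    suggestion: Optional[str]   # None when the word is unchanged


def apply_corrections(
    query: str,
    words: List[Word],
    mark: Callable[[str], str] = lambda s: s,
) -> str:
    """Rebuild the query, substituting suggestions and keeping original spacing.

    `mark` decorates each corrected word; the default leaves it unmarked.
    """
    out, pos = [], 0
    for w in sorted(words, key=lambda w: w.offset):
        out.append(query[pos:w.offset])  # text between words, e.g. spaces
        out.append(w.original if w.suggestion is None else mark(w.suggestion))
        pos = w.offset + len(w.original)
    out.append(query[pos:])              # anything after the last word
    return "".join(out)


words = [Word("london", 0, None), Word("brigge", 7, "bridge")]
plain = apply_corrections("london brigge", words)
marked = apply_corrections("london brigge", words, mark=lambda s: f"<i>{s}</i>")
print(plain)
print(marked)
```

Splicing by offset rather than re-joining tokens keeps the user's original spacing and punctuation intact, which is one reason offset information may be worth returning.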

Grant or Shalin, will either of you be adding this?


> Spell Checker as a Search Component
> -----------------------------------
>
>                 Key: SOLR-572
>                 URL: https://issues.apache.org/jira/browse/SOLR-572
>             Project: Solr
>          Issue Type: New Feature
>          Components: spellchecker
>    Affects Versions: 1.3
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
>                      SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
>                      SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the following features:
> * Allow creating a spell index on a given field and make it possible to have multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as they are
> * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

