lucene-solr-dev mailing list archives

From "Mike Klaas (JIRA)" <j...@apache.org>
Subject [jira] Closed: (SOLR-375) SpellCheckerRequestHandler improvements to handle multiWords and identify if a word is spelled correctly
Date Thu, 01 Nov 2007 22:42:50 GMT

     [ https://issues.apache.org/jira/browse/SOLR-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mike Klaas closed SOLR-375.
---------------------------

    Resolution: Invalid
      Assignee: Mike Klaas

Scott, I worked with Mike to produce a patch that integrates all the new features (frequency,
multiwords, thresholding, etc.) into a single patch in SOLR-395.

> SpellCheckerRequestHandler improvements to handle multiWords and identify if a word is spelled correctly
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-375
>                 URL: https://issues.apache.org/jira/browse/SOLR-375
>             Project: Solr
>          Issue Type: Improvement
>          Components: spellchecker
>    Affects Versions: 1.2
>         Environment: Tested using: Windows XP, Apache Tomcat v5.5.23, Java JDK 1.5.0_12, Solr v1.2
>            Reporter: Scott Tabar
>            Assignee: Mike Klaas
>             Fix For: 1.3
>
>         Attachments: JIRA_SOLR-375.diff
>
>
> The current implementation of SpellCheckerRequestHandler has some limitations:
> 1. It does not identify whether a word is spelled correctly (that is, whether it matches an entry in its index)
>   a. If a word is spelled correctly, the correct spelling is not included in the suggestion list, so the suggestions cannot be used to deduce whether the word is correct
>   b. If the word does not exist in the index and there are no suggestions, the suggestion list is empty
> 2. No support for multiple words
> I have made some changes to this class that address these limitations:
> 1. The key/value pair exists=true/false has been added to indicate clearly whether the word is in the index
> 2. The key/value pair words=_words_to_be_checked_ identifies the original word(s) that were checked and that the suggestion list applies to.  This becomes more important with support for multiple words.
> 3. If the parameter multiWords=true is present on the query string, support for multiple words is enabled.
>   a. Multiple words are given in the value of q and are separated by either a space or +
>   b. Each word has its own entry in a NamedList object, which groups all result attributes for that word: words=, exist=, and suggestions=
>  
> My intended goal is that these changes should not affect existing implementations of the spell checker within Solr.
> The format of the multiWords output should be easy to consume from Prototype if the output type is JSON.
> I have made the changes.  I still need to do some basic testing to ensure everything works as intended, then I will commit to SVN (within 24 hours?).  When I commit, I will also add more JavaDocs to the class, and also try to attach more comments to this JIRA.
>  
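
The multiWords response shape described in the quoted issue can be illustrated with a minimal sketch. This is not the patch attached to SOLR-375 and not Solr code: the function name check_words, the INDEX set, and the use of Python's difflib as a stand-in for the Lucene spell checker are all assumptions made for illustration. The issue text writes the flag as both exists= and exist=; the sketch arbitrarily uses exist.

```python
import difflib
import json
import re

# Hypothetical stand-in for the spelling index; the real handler consults a Lucene index.
INDEX = {"solr", "lucene", "search", "spell", "checker"}


def check_words(q, multi_words=False):
    """Return one result entry per checked word, grouped as the issue describes."""
    if multi_words:
        # Words in q are separated by either a space or '+'.
        words = [w for w in re.split(r"[ +]+", q.strip()) if w]
    else:
        # Without multiWords=true the whole value of q is treated as one word.
        words = [q]

    results = []
    for word in words:
        # Assumed behaviour: a correctly spelled word is not listed among its own suggestions.
        candidates = sorted(INDEX - {word})
        results.append({
            "words": word,                 # the original word that was checked
            "exist": word in INDEX,        # whether the word is in the index
            "suggestions": difflib.get_close_matches(word, candidates, n=3),
        })
    return results


if __name__ == "__main__":
    print(json.dumps(check_words("solr+serch", multi_words=True), indent=2))
```

Grouping the three attributes under one entry per word means a JSON client can iterate over the entries without having to match suggestion lists back to words by position.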

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

