lucene-solr-dev mailing list archives

From "Scott Tabar (JIRA)" <j...@apache.org>
Subject [jira] Updated: (SOLR-375) SpellCheckerRequestHandler improvements to handle multiWords and identify if a word is spelled correctly
Date Thu, 11 Oct 2007 06:01:50 GMT

     [ https://issues.apache.org/jira/browse/SOLR-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Scott Tabar updated SOLR-375:
-----------------------------

    Attachment: JIRA_SOLR-375.diff

This patch includes the modifications to the SpellCheckerRequestHandler, along with new JUnit
tests and the configuration files those tests require.

All JUnit tests complete successfully, and the changes made to the SpellCheckerRequestHandler
behave as expected.

> SpellCheckerRequestHandler improvements to handle multiWords and identify if a word is
spelled correctly
> --------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-375
>                 URL: https://issues.apache.org/jira/browse/SOLR-375
>             Project: Solr
>          Issue Type: Improvement
>          Components: clients - java
>    Affects Versions: 1.2
>         Environment: Tested using: Windows XP, Apache TomCat v5.5.23, Java JDK 1.5.0_12,
Solr v1.2
>            Reporter: Scott Tabar
>             Fix For: 1.2
>
>         Attachments: JIRA_SOLR-375.diff
>
>
> The current implementation of SpellCheckerRequestHandler has some limitations:
> 1. It does not identify if a word is spelled correctly (a match to its index) 
>   a. If a word is spelled correctly, the correct spelling is not included in the suggested
list, so the suggestions cannot be used to deduce if the word is correct
>   b. If the word does not exist in the index and there are no suggestions, the suggestion
list is empty
> 2. No support for multiple words
> I have made some changes to this class that address these limitations:
> 1. The key-value pair exists=true/false has been added to indicate clearly whether the
word is in the index.
> 2. The key-value pair words=_words_to_be_checked_ has been added to identify the original
word(s) that were checked and that the suggestion list applies to.  This becomes more important
with support for multiple words.
> 3. If the query string contains the parameter multiWords=true, support for multiple words
is enabled.
>   a. Multiple words are given in the value of q and are separated by either a space
or +
>   b. Each word has its own entry in a NamedList object, which groups all result attributes
for that word: words=, exist=, and suggestions=
>  
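
The multiWords behaviour in item 3 can be sketched roughly as follows. This is only an illustration and not the code in the attached patch: the class and method names are hypothetical, a LinkedHashMap stands in for Solr's NamedList, and the index and suggester are plain in-memory collections rather than a Lucene spelling index. The per-word key is written as exist here, following item 3b, although item 1 spells it exists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names come from the patch.
public class MultiWordSpellCheckSketch {

    /** Splits the raw q value on spaces or '+' and drops empty tokens. */
    public static List<String> splitWords(String q) {
        List<String> words = new ArrayList<>();
        for (String token : q.trim().split("[ +]+")) {
            if (!token.isEmpty()) {
                words.add(token);
            }
        }
        return words;
    }

    /**
     * Builds one entry per word so that words, exist, and suggestions stay
     * grouped together. The index and suggester are stand-ins (assumptions)
     * for the real spelling index.
     */
    public static List<Map<String, Object>> check(String q, Set<String> index,
                                                  Map<String, List<String>> suggester) {
        List<Map<String, Object>> results = new ArrayList<>();
        for (String word : splitWords(q)) {
            Map<String, Object> entry = new LinkedHashMap<>();
            entry.put("words", word);
            entry.put("exist", index.contains(word));
            List<String> found = suggester.get(word);
            entry.put("suggestions",
                    found != null ? found : Collections.<String>emptyList());
            results.add(entry);
        }
        return results;
    }

    public static void main(String[] args) {
        Set<String> index = new HashSet<>(Arrays.asList("hello", "world"));
        Map<String, List<String>> suggester = new HashMap<>();
        suggester.put("wrold", Arrays.asList("world"));
        // Prints one grouped entry per word in q.
        for (Map<String, Object> entry : check("hello+wrold", index, suggester)) {
            System.out.println(entry);
        }
    }
}
```

Keeping each word's attributes in its own entry means a client can read exist directly instead of inferring correctness from an empty or non-empty suggestion list.
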
> My goal is that these changes should not affect existing implementations of
the spell checker within Solr.
> The format of the multiWords output should be easy to consume from Prototype
when the output type is JSON.
> I have made the changes.  I still need to do some basic testing to ensure everything works
as intended, and then I will commit to SVN (within 24 hours?).  When I commit, I will also
add more JavaDocs to the class and try to attach more comments to this JIRA.
>  

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

