lucene-dev mailing list archives

From "James Dyer (Commented) (JIRA)" <>
Subject [jira] [Commented] (SOLR-3029) Poor json formatting of spelling collation info
Date Thu, 29 Mar 2012 15:02:27 GMT


James Dyer commented on SOLR-3029:


I can answer some of your questions.  I do agree the spellcheck response format leaves something
to be desired, and maybe 4.0 is a good time to break backwards compatibility and improve it.

Unless order is really important, "suggestions" should be a map
I don't see why order would matter here, although some users might like to see the corrections
listed in the order they appeared in the query.

same for "collation"
The collations are ranked, so order is important.

and "misspellingsAndCorrections"
The order shouldn't matter unless users are picky about the corrections being presented in
the order they occur in the query.

why is "collation" inside "suggestions" along with other words? should this be one level higher?
This always confused me too.  I agree it should be one level higher.

why isn't this giving me multiple collations
This is a bug.  See SOLR-2853.

why aren't multiple suggestions returned in misspellingsAndCorrections? (and what's the purpose
This is nested with the Collation and gives details, for that particular collation, which
misspelled word got which replacement.  This makes it easy for clients to generate messages
like "no results found for abcdefgq ...  Showing abcdefgx instead!"  You can suppress this
information by not specifying "spellcheck.collateExtendedResults=true".  For users (like me)
who are interested in the collations only and don't care about individual-word corrections,
it would be nice if we could suppress the first section of the response entirely.
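
As a rough illustration of that client-side use, here is a minimal Python sketch. It assumes one extended collation entry has already been decoded into a dict with a "misspellingsAndCorrections" list of (misspelled, correction) pairs; the exact wire layout may differ, and the function name and the "hits" value are made up for the example.

```python
def showing_instead_message(collation):
    """Build a 'showing X instead' message from one extended collation entry.

    Assumption: `collation` is a dict whose "misspellingsAndCorrections"
    value is a list of (misspelled, correction) pairs.
    """
    pairs = collation["misspellingsAndCorrections"]
    misspelled = ", ".join(wrong for wrong, _ in pairs)
    corrected = ", ".join(right for _, right in pairs)
    return "No results found for {} ... Showing {} instead!".format(
        misspelled, corrected
    )


# Hypothetical entry; the words come from the example message above,
# the "hits" count is invented for illustration.
example = {
    "collationQuery": "abcdefgx",
    "hits": 3,
    "misspellingsAndCorrections": [("abcdefgq", "abcdefgx")],
}

print(showing_instead_message(example))
```

A client that only wants the rewritten query could skip the pairs and read "collationQuery" directly.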

I briefly tried distributed search...
DistributedSpellCheckComponentTest is supposed to detect problems like this but maybe something
is going on and there is a bug this test isn't catching?

For what it's worth, you had voiced some misgivings about the JSON format when the multiple-collations
feature was added.  At that time I supplied a quick patch to address your concerns.  I'm not
sure if that patch fixes the problem described here.  See SOLR-2010 and your comment from
Oct 16, 2010 and the (now outdated, never committed) patch I supplied on Oct 20.  

The patch on this issue causes multiple test failures although I didn't look into them.

> Poor json formatting of spelling collation info
> -----------------------------------------------
>                 Key: SOLR-3029
>                 URL:
>             Project: Solr
>          Issue Type: Bug
>          Components: spellchecker
>    Affects Versions: 4.0
>            Reporter: Antony Stubbs
>            Priority: Blocker
>         Attachments: SOLR-3029.patch
> {noformat}
> "spellcheck": {
>     "suggestions": [
>     "dalllas",
>     {
> <snip>
>         {
>             "word": "canallas",
>             "freq": 1
>         }
>         ]
>     },
>     "correctlySpelled",
>     false,
>     "collation",
>     "dallas"
>     ]
> }
> {noformat}
> The correctlySpelled and collation key/values are stored as consecutive elements in an
> array - quite odd. Is there a reason it's not a key/value map like most things?
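
For readers who have to consume the flat layout quoted in the issue description, the following is a small Python sketch of one way to pair up the alternating key/value elements on the client. The helper name is hypothetical, and the inner "suggestion" key stands in for the part that is snipped in the issue. Keeping a list of pairs rather than a plain dict preserves order and tolerates a repeated key such as "collation", which matters because the collations are ranked.

```python
def pairs_from_flat(flat):
    """Turn [k1, v1, k2, v2, ...] into [(k1, v1), (k2, v2), ...]."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of elements")
    # Even positions hold keys, odd positions hold the matching values.
    return list(zip(flat[0::2], flat[1::2]))


# Trimmed stand-in for the "suggestions" array shown in the issue;
# the dict under "dalllas" is an assumption, since the original is snipped.
suggestions = [
    "dalllas", {"suggestion": [{"word": "canallas", "freq": 1}]},
    "correctlySpelled", False,
    "collation", "dallas",
]

pairs = pairs_from_flat(suggestions)
collations = [value for key, value in pairs if key == "collation"]
print(collations)
```

Filtering the pairs by key, as in the last lines, keeps every collation in its original rank order even if several are returned.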
