From "Mike Krimerman (JIRA)" <>
Subject [jira] Updated: (SOLR-395) Spell-check should return frequencies of word and suggestions
Date Thu, 01 Nov 2007 22:27:50 GMT


Mike Krimerman updated SOLR-395:

    Attachment: extended_results.diff

The attached patch combines patches for issues 375, 395, 401 and some more:
# (375) Adds the *exist* property for a single-word spell-check, indicating whether the word exists
in the dictionary
# Adds the *sp.query.onlyMorePopular* option, which returns only suggestions that are more popular
than the query word(s)
# The *sp.query.extendedResults* option enables a multi-word query and returns frequencies for
each word in the query and for each suggestion.
# (401) Adds a minimum *threshold* for adding words to the spell-check dictionary, expressed as the
fraction (percent/100) of documents in which a word must appear.
# *Arguments* now carry the 'sp' prefix; the old names remain supported for backwards compatibility.
## _sp.dictionary.indexDir_ - backwards compatible with _spellcheckerIndexDir_
## _sp.dictionary.termSourceField_ - backwards compatible with _termSourceField_
## _sp.dictionary.threshold_ - threshold for words to enter the dictionary
## _sp.query.suggestionCount_ - backwards compatible with _suggestionCount_
## _sp.query.accuracy_ - backwards compatible with _accuracy_
## _sp.query.onlyMorePopular_ - only more popular suggestions
## _sp.query.extendedResults_ - multi-word query and a response with frequencies
# (375) A *unit-test* file, extended and modified to test 401
# Formatted the extended results to be friendlier to Python/Ruby clients
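
To make the threshold and onlyMorePopular semantics described above concrete, here is a minimal Python sketch. It is not code from the attached patch: the function names are hypothetical, and whether the patch compares with ">=" or ">" at the threshold boundary is an assumption.

```python
from typing import Dict


def passes_threshold(doc_freq: int, num_docs: int, threshold: float) -> bool:
    """Return True if a word appears in at least `threshold` of all documents.

    `threshold` is a fraction (percent/100), e.g. 0.01 means 1% of documents.
    Assumption: the boundary is inclusive (>=); the patch may differ.
    """
    if num_docs <= 0:
        return False
    return doc_freq / num_docs >= threshold


def only_more_popular(query_freq: int, suggestions: Dict[str, int]) -> Dict[str, int]:
    """Keep only suggestions whose frequency is higher than the query word's.

    `suggestions` maps each suggested word to its frequency in the index.
    """
    return {word: freq for word, freq in suggestions.items() if freq > query_freq}
```

For example, with 1000 documents and a threshold of 0.01, a word found in 5 documents would be kept out of the dictionary, while a word found in 20 documents would be added.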

> Spell-check should return frequencies of word and suggestions
> -------------------------------------------------------------
>                 Key: SOLR-395
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>          Components: spellchecker
>    Affects Versions: 1.3
>            Reporter: Mike Krimerman
>            Priority: Minor
>             Fix For: 1.3
>         Attachments: extended_results.diff, returnFrequencies.patch
> When issuing a spell-check, the word being searched for might be present in the index
with a very low frequency (e.g. a misspelling that made its way into the index). It might
therefore be helpful if the client receives the frequency of the word plus the frequencies
of each of the suggestions.
> This feature should be optional (using a URL param).
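
As a rough illustration of opting in through URL parameters, a client might build its request as sketched below. Only the sp.query.* parameter names come from this issue; the base URL, the "q" parameter, the "true" values, and the helper name are assumptions made for the example.

```python
from urllib.parse import urlencode


def build_spellcheck_url(base_url: str, words: str,
                         extended: bool = True,
                         only_more_popular: bool = False) -> str:
    """Build a spell-check request URL (hypothetical helper).

    Assumptions: the query is passed as "q" and flags are enabled with "true".
    """
    params = {"q": words}
    if extended:
        # Ask for frequencies of the query words and of each suggestion.
        params["sp.query.extendedResults"] = "true"
    if only_more_popular:
        # Ask for suggestions more frequent than the query word(s) only.
        params["sp.query.onlyMorePopular"] = "true"
    return base_url + "?" + urlencode(params)


# The host and handler path here are placeholders, not taken from the issue.
url = build_spellcheck_url("http://localhost:8983/solr/spellchecker", "serch")
```

Leaving both flags off yields a plain request, which matches the intent that the extended output stays optional.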

