lucene-dev mailing list archives

From "James Dyer (JIRA)" <>
Subject [jira] [Updated] (SOLR-2571) IndexBasedSpellChecker "thresholdTokenFrequency" fails with a ClassCastException on startup
Date Fri, 03 Jun 2011 15:59:48 GMT


James Dyer updated SOLR-2571:

    Attachment: SOLR-2571.patch

I'm betting the jury will rule we keep this a <float /> element, so here's a patch that
changes DirectSolrSpellChecker.  I also added a unit test for thresholdTokenFrequency and
added a (commented-out) line for it in the example solrconfig.xml.

There are 3 TODOs in the unit test code:
1. My ignorance of the expression language used in unit tests led me to write an old-style
long-form unit test.  If someone can show me how to convert this to a 1-liner I would be very

2. I found that DirectSolrSpellChecker returns results in a slightly different format than
IndexBasedSpellChecker.  Is this OK?  Can SOLRJ handle this or do we need to tweak there?

3. Also, in one case IndexBasedSpellChecker returns "correctlySpelled=false" while DirectSolrSpellChecker
returns "correctlySpelled=true".  Is this discrepancy valid?

> IndexBasedSpellChecker "thresholdTokenFrequency" fails with a ClassCastException on startup
> -------------------------------------------------------------------------------------------
>                 Key: SOLR-2571
>                 URL:
>             Project: Solr
>          Issue Type: Bug
>          Components: spellchecker
>    Affects Versions: 1.4.1, 3.1, 4.0
>            Reporter: James Dyer
>            Priority: Minor
>              Labels: whereIsHossManWhenYouNeedHim
>             Fix For: 3.3, 4.0
>         Attachments: SOLR-2571.patch, SOLR-2571.patch, SOLR-2571.solr3.2.patch
> When parsing the configuration for "thresholdTokenFrequency", the IndexBasedSpellChecker
tries to pull a Float from the DataConfig.xml-derived NamedList.  However, this comes through
as a String.  Therefore, a ClassCastException is always thrown whenever this parameter is
specified.  The code ought to be doing "Float.parseFloat(...)" on the value.
> This looks like a nice feature to use in cases where the data contains misspelled or rare words
leading to spurious "correct" queries.  I would have liked to have used this with a project
we just completed; however, this bug prevented that.  This issue came up recently in the User's
mailing list so I am raising an issue now.
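
For readers following along, here is a minimal, self-contained sketch of the failure mode described in the issue and of one tolerant way to read the value. It is not the attached patch. A plain java.util.Map stands in for Solr's NamedList, and the class name ThresholdParseSketch and the helper readThreshold are hypothetical names made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class ThresholdParseSketch {

    // Hypothetical helper (not Solr API): accept either a Number or a String,
    // so the setting works whichever way the configuration value arrives.
    static float readThreshold(Map<String, Object> config, String key, float defaultValue) {
        Object raw = config.get(key);
        if (raw == null) {
            return defaultValue;                    // parameter not specified
        }
        if (raw instanceof Number) {
            return ((Number) raw).floatValue();     // already numeric
        }
        return Float.parseFloat(raw.toString().trim()); // String such as "0.01"
    }

    public static void main(String[] args) {
        // Assumption: the configured value comes through as a String.
        Map<String, Object> config = new HashMap<>();
        config.put("thresholdTokenFrequency", "0.01");

        try {
            // A direct cast of a String to Float fails at runtime.
            Float broken = (Float) config.get("thresholdTokenFrequency");
            System.out.println("unexpected: " + broken);
        } catch (ClassCastException e) {
            System.out.println("cast failed");
        }

        // Parsing the value instead of casting it succeeds.
        System.out.println(readThreshold(config, "thresholdTokenFrequency", 0.0f));
    }
}
```

Handling both Number and String in the helper is a design choice made for this sketch, so that the reader does not depend on how the element is declared in the configuration.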
