lucene-dev mailing list archives

From "Dawid Weiss (JIRA)" <>
Subject [jira] [Commented] (SOLR-2378) FST-based Lookup (suggestions) for prefix matches.
Date Thu, 07 Apr 2011 11:46:05 GMT


Dawid Weiss commented on SOLR-2378:

Ok, updated patch. The only thing I would like to add is a real Solr handler test, much like
SuggesterTest. Should I add a separate test class or simply add another handler to that config
file and test methods to SuggesterTest?

Also, this one puzzled me:
    threshold = config.get(THRESHOLD_TOKEN_FREQUENCY) == null ? 0.0f
            : (Float)config.get(THRESHOLD_TOKEN_FREQUENCY);
What are the conversion rules for NamedList that SolrSpellChecker gets in init()? I have an
Integer parameter, but didn't check what is returned for, say, "12" (String, Float, Integer?).
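
Whatever the conversion rules turn out to be, the direct (Float) cast above will throw a
ClassCastException if the value arrives as an Integer or a String. A minimal defensive sketch
follows. It assumes nothing about NamedList itself: a plain Map stands in for it, and the key
name and the toFloat helper are hypothetical, not part of Solr.

```java
import java.util.HashMap;
import java.util.Map;

public class ThresholdParam {
    /**
     * Converts a loosely typed config value to a float.
     * Hypothetical helper for illustration; not part of the Solr API.
     */
    static float toFloat(Object value, float defaultValue) {
        if (value == null) {
            return defaultValue;
        }
        if (value instanceof Number) {
            // Covers Float, Integer, Double, Long, ...
            return ((Number) value).floatValue();
        }
        // Anything else (typically a String such as "12") is parsed from its text form.
        // Float.parseFloat throws NumberFormatException on malformed input.
        return Float.parseFloat(value.toString().trim());
    }

    public static void main(String[] args) {
        // A plain Map stands in for the NamedList passed to init() (assumption).
        Map<String, Object> config = new HashMap<>();
        config.put("thresholdTokenFrequency", "12"); // hypothetical key name

        float threshold = toFloat(config.get("thresholdTokenFrequency"), 0.0f);
        System.out.println(threshold);
    }
}
```

Going through Number rather than casting to one boxed type means the same code path works
regardless of which concrete type the config parser produces.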

> FST-based Lookup (suggestions) for prefix matches.
> --------------------------------------------------
>                 Key: SOLR-2378
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: spellchecker
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>              Labels: lookup, prefix
>             Fix For: 4.0
>         Attachments: SOLR-2378.patch
> Implement a subclass of Lookup based on finite state automata/transducers (Lucene FST
> package). This issue is for implementing a relatively basic prefix matcher; we will handle
> infixes and other types of input matches gradually. Implementation phases:
> - -write a DFA based suggester effectively identical to ternary tree based solution right
> - -baseline benchmark against tern. tree (memory consumption, rebuilding speed, indexing
> speed; reuse Andrzej's benchmark code)-
> - -modify DFA to encode term weights directly in the automaton (optimize for onlyMostPopular
> - -benchmark again-
> - add infix suggestion support with prefix matches boosted higher (?)
> - benchmark again
> - modify the tutorial on the wiki
