lucene-dev mailing list archives

From "Dawid Weiss (JIRA)" <>
Subject [jira] [Commented] (SOLR-2378) FST-based Lookup (suggestions) for prefix matches.
Date Thu, 31 Mar 2011 11:33:05 GMT


Dawid Weiss commented on SOLR-2378:

I didn't have time to take care of this until now, apologies. So, looking at Lookup#lookup(),
I just wanted to clarify:

  /**
   * Look up a key and return possible completion for this key.
   * @param key lookup key. Depending on the implementation this may be
   * a prefix, misspelling, or even infix.
   * @param onlyMorePopular return only more popular results
   * @param num maximum number of results to return
   * @return a list of possible completions, with their relative weight (e.g. popularity)
   */
  public abstract List<LookupResult> lookup(String key, boolean onlyMorePopular, int num);

"onlyMorePopular" means more popular than... what? I see that TSTLookup and JaspellLookup (Andrzej,
will you confirm, please?) sort matches in a priority queue by their associated value (frequency,
I guess). This makes sense, but onlyMorePopular is misleading -- it should be called onlyMostPopular
(those with native knowledge of English subtleties, speak up if I'm right here).
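
To make the "most popular" reading concrete, here is a minimal sketch of keeping the top `num` prefix matches in a priority queue ordered by weight. It is not the TSTLookup or JaspellLookup code: the class `TopNCompletions`, the nested `Result` holder and the plain `Map` standing in for the dictionary are all hypothetical names used for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical illustration; not part of Solr or Lucene.
final class TopNCompletions {

    /** Hypothetical result holder (a stand-in for a key/weight pair). */
    public static final class Result {
        public final String key;
        public final float weight;

        public Result(String key, float weight) {
            this.key = key;
            this.weight = weight;
        }
    }

    /** Returns at most num keys starting with prefix, heaviest first. */
    public static List<Result> lookup(Map<String, Float> dictionary, String prefix, int num) {
        // Min-heap on weight: the head is always the weakest match kept so far.
        PriorityQueue<Result> queue =
            new PriorityQueue<>((a, b) -> Float.compare(a.weight, b.weight));

        for (Map.Entry<String, Float> e : dictionary.entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                continue;
            }
            queue.add(new Result(e.getKey(), e.getValue()));
            if (queue.size() > num) {
                queue.poll(); // evict the least popular match
            }
        }

        // Drain into a list and order by descending weight.
        List<Result> results = new ArrayList<>(queue);
        results.sort((a, b) -> Float.compare(b.weight, a.weight));
        return results;
    }
}
```

Under this reading the flag selects the `num` heaviest matches overall, which is why "most popular" would describe it better than "more popular".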

I also wanted to confirm: the Dictionary can come from various sources, so we can't
rely on the presence of the built-in Lucene automaton, can we? Even if I wanted to reuse it,
there'd be no easy way to determine if it's a full automaton or a partial one (because of
the gaps/trimming)... I think I'll just implement the solution by building the automaton from
whatever Dictionary comes in and serializing/deserializing it, similar to what TSTLookup does.
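
The build-then-persist flow proposed above could look roughly like the sketch below. It deliberately uses a sorted map as a stand-in for the automaton, so it shows only the life cycle (build from whatever entries arrive, store, load, query by prefix) and not the Lucene FST API. The class `PrefixIndex` and all of its methods are hypothetical.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Iterator;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical stand-in for an automaton-backed lookup; not Lucene code.
final class PrefixIndex {
    private final TreeMap<String, Float> entries = new TreeMap<>();

    /** Builds the index from any source of (term, weight) pairs. */
    static PrefixIndex build(Iterator<Map.Entry<String, Float>> dictionary) {
        PrefixIndex index = new PrefixIndex();
        while (dictionary.hasNext()) {
            Map.Entry<String, Float> e = dictionary.next();
            index.entries.put(e.getKey(), e.getValue());
        }
        return index;
    }

    /** Writes the entry count followed by each (term, weight) pair. */
    void store(OutputStream out) throws IOException {
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(entries.size());
        for (Map.Entry<String, Float> e : entries.entrySet()) {
            data.writeUTF(e.getKey());
            data.writeFloat(e.getValue());
        }
        data.flush();
    }

    /** Reads back an index written by store(). */
    static PrefixIndex load(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        PrefixIndex index = new PrefixIndex();
        int size = data.readInt();
        for (int i = 0; i < size; i++) {
            String term = data.readUTF();
            index.entries.put(term, data.readFloat());
        }
        return index;
    }

    /**
     * Returns all entries whose key starts with prefix. The upper bound
     * prefix + '\uffff' is a simplification that ignores keys containing
     * that character right after the prefix.
     */
    SortedMap<String, Float> matches(String prefix) {
        return entries.subMap(prefix, prefix + Character.MAX_VALUE);
    }
}
```

Because the index is rebuilt from the incoming entries, nothing here depends on where the Dictionary came from or on whether an existing automaton is complete.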

Sounds ok?

> FST-based Lookup (suggestions) for prefix matches.
> --------------------------------------------------
>                 Key: SOLR-2378
>                 URL:
>             Project: Solr
>          Issue Type: New Feature
>          Components: spellchecker
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>              Labels: lookup, prefix
>             Fix For: 4.0
> Implement a subclass of Lookup based on finite state automata/transducers (Lucene FST
> package). This issue is for implementing a relatively basic prefix matcher; we will handle
> infixes and other types of input matches gradually.

This message is automatically generated by JIRA.