lucene-java-user mailing list archives

From "Oliver Christ" <ochr...@ebscohost.com>
Subject Suggesters: circumfix suggestions
Date Wed, 16 Jan 2013 21:27:33 GMT
Hi, 

 

Has anyone tried to implement circumfix suggesters, where the suggestion
is a circumfix of the lookup string? 

 

E.g. "sox rumor" suggests "boston red sox rumors" (try it on
google.com).

 

I think there are several ways to implement this: 

 

* Given some multiword term, add all of its word-level suffixes to the
suggester individually ("boston red sox rumors" also adds "red sox
rumors", "sox rumors", "rumors") - that can be achieved using a special
TermFreqIterator. This turns the lookup problem into a standard prefix
search. While this works, it effectively modifies the surface form, and
the "full term" needs to be indexed and looked up elsewhere.

* Constructing a token graph with appropriate substring arcs
from the (hopefully linear) token sequence, using a special TokenFilter.
The benefit is that the surface form is always the same, but the
automaton may become large (at least if you are using an
AnalyzingSuggester).

* DIY, using suffix arrays or something similar.
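
To make the first option concrete, here is a rough sketch in plain Java that does not use any Lucene classes. The class name WordSuffixSuggester and its add/lookup methods are made up for illustration, and a TreeMap stands in for the real suggester. Each word-level suffix is stored as a key that points back to the full term, so a plain prefix lookup on the suffix can still return the original surface form. It ignores weights and analysis (case folding, stemming) entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.NavigableMap;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: a TreeMap-backed stand-in for
// "add every word-level suffix, then do a standard prefix search".
public class WordSuffixSuggester {

    // Each word-level suffix maps to the full terms it was derived from.
    private final NavigableMap<String, Set<String>> suffixToTerms = new TreeMap<>();

    // Registers the term under each of its word-level suffixes.
    public void add(String term) {
        String trimmed = term.trim();
        if (trimmed.isEmpty()) {
            return;
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i < words.length; i++) {
            String suffix = String.join(" ", Arrays.copyOfRange(words, i, words.length));
            suffixToTerms.computeIfAbsent(suffix, k -> new TreeSet<>()).add(trimmed);
        }
    }

    // Returns up to max full terms that have a word-level suffix starting with query.
    public List<String> lookup(String query, int max) {
        Set<String> results = new LinkedHashSet<>();
        // Every key with the given prefix lies in [query, query + '\uffff').
        String upper = query + Character.MAX_VALUE;
        for (Set<String> terms : suffixToTerms.subMap(query, true, upper, false).values()) {
            for (String term : terms) {
                if (results.size() >= max) {
                    return new ArrayList<>(results);
                }
                results.add(term);
            }
        }
        return new ArrayList<>(results);
    }

    public static void main(String[] args) {
        WordSuffixSuggester suggester = new WordSuffixSuggester();
        suggester.add("boston red sox rumors");
        suggester.add("chicago white sox");
        System.out.println(suggester.lookup("sox rumor", 5));
    }
}
```

With this layout the query only matches at word boundaries: "sox rumor" finds "boston red sox rumors", while "ed sox" finds nothing. The cost is one extra key per word in every term, which is the same kind of blow-up the suffix entries would cause inside a real suggester.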

 

But I'm sure there are other ways and/or tradeoffs I haven't thought
about :-) I'd be interested in your feedback.

 

Cheers, Oli

 

