lucene-java-user mailing list archives

From "Oliver Christ" <>
Subject Suggesters: circumfix suggestions
Date Wed, 16 Jan 2013 21:27:33 GMT


Has anyone tried to implement circumfix suggesters, where the suggestion
is a circumfix of the lookup string? 


E.g. "sox rumor" suggests "boston red sox rumors" (try it on


I think there are several ways to implement this: 


*         Given some multiword term, add all of its word-level suffixes to
the suggester individually ("boston red sox rumors" also adds "red sox
rumors", "sox rumors", "rumors") - that can be achieved using a special
TermFreqIterator. This turns the lookup problem into a standard prefix
search. While this works, it effectively modifies the surface form, so
the "full term" needs to be indexed and looked up elsewhere.

*         Constructing a token graph with appropriate substring arcs
from the (hopefully linear) token sequence, using a special TokenFilter.
The benefit is that the surface form is always the same, but the
automaton may become large (at least if you are using an

*         DIY, using suffix arrays or something similar.
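
To make the first option concrete, here is a minimal sketch in plain Java. It deliberately uses no Lucene classes: a TreeMap stands in for the suggester, and the class name WordSuffixSuggester and its methods are hypothetical names made up for this illustration. Each word-level suffix is stored as a key that points back to the full term, which is the "looked up elsewhere" part.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical illustration only; not part of Lucene.
class WordSuffixSuggester {
    // Maps each word-level suffix to the full terms it was derived from.
    private final TreeMap<String, Set<String>> suffixToTerms = new TreeMap<>();

    // "boston red sox rumors" -> the term itself plus
    // "red sox rumors", "sox rumors", "rumors".
    public static List<String> wordSuffixes(String term) {
        String[] words = term.trim().split("\\s+");
        List<String> out = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            out.add(String.join(" ", Arrays.copyOfRange(words, i, words.length)));
        }
        return out;
    }

    // Index every suffix, remembering which full term it belongs to.
    public void add(String term) {
        for (String suffix : wordSuffixes(term)) {
            suffixToTerms.computeIfAbsent(suffix, k -> new TreeSet<>()).add(term);
        }
    }

    // Plain prefix search over the suffix keys: every key in the range
    // [query, query + '\uffff') starts with the query string.
    public Set<String> lookup(String query) {
        Set<String> result = new TreeSet<>();
        for (Set<String> terms
                : suffixToTerms.subMap(query, query + Character.MAX_VALUE).values()) {
            result.addAll(terms);
        }
        return result;
    }

    public static void main(String[] args) {
        WordSuffixSuggester s = new WordSuffixSuggester();
        s.add("boston red sox rumors");
        System.out.println(s.lookup("sox rumor")); // prints [boston red sox rumors]
    }
}
```

The sketch ignores weights and ranking, and it stores one key per word of every term, so the key count grows with the total number of words across all terms.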


But I'm sure there are other ways and/or tradeoffs I haven't thought
about :-) I'd be interested in your feedback.


Cheers, Oli

