lucene-java-user mailing list archives

From Dawid Weiss <dawid.we...@gmail.com>
Subject Re: Suggesters: circumfix suggestions
Date Wed, 16 Jan 2013 22:56:27 GMT
> Eg, you'd index only "boston", "red", "sox", "rumor" into the FST, and
> then have a separate search index with "boston red sox rumor" indexed
> as a document.  If the user types "red so", then you run suggest on
> "red" and on "so", and then run a hmm MultiPhraseQuery for
> (red|redmond|reddit) (so|sox|sophomore|...) against the index?  How to

I know of at least one company (ehm, can't tell by name) that does
this for matching physical locations against user queries ("n yo" =>
"new york", etc.). Granted, this is a very closed domain and the boosts
can be approximated pretty well (cities by the number of citizens,
streets by their location, etc.).
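
A minimal sketch of the per-token approach quoted above, in plain Python
rather than Lucene: a sorted term list stands in for the FST, and a
linear scan over whitespace-tokenized documents stands in for the
MultiPhraseQuery. The names prefix_matches and suggest_phrases are
hypothetical, and ranking/boosting is left out entirely.

```python
from bisect import bisect_left


def prefix_matches(sorted_terms, prefix):
    """Return every term in the sorted list that starts with `prefix`."""
    i = bisect_left(sorted_terms, prefix)
    matches = set()
    while i < len(sorted_terms) and sorted_terms[i].startswith(prefix):
        matches.add(sorted_terms[i])
        i += 1
    return matches


def suggest_phrases(query, sorted_terms, documents):
    """Expand each typed token by prefix, then keep the documents that
    contain one expansion per position as consecutive tokens."""
    expansions = [prefix_matches(sorted_terms, tok)
                  for tok in query.lower().split()]
    if not expansions or any(not e for e in expansions):
        return []
    n = len(expansions)
    hits = []
    for doc in documents:
        tokens = doc.lower().split()
        for start in range(len(tokens) - n + 1):
            if all(tokens[start + k] in expansions[k] for k in range(n)):
                hits.append(doc)
                break
    return hits


terms = sorted(["boston", "red", "reddit", "redmond",
                "rumor", "sophomore", "sox"])
docs = ["boston red sox rumor", "reddit sophomore thread"]
print(suggest_phrases("red so", terms, docs))
```

Both sample documents match "red so" here, which is exactly why the
boosts matter: without them there is no way to prefer "red sox" over
"reddit sophomore".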

Good idea to try out though. Another possible alternative would be to
run a frequent phrase extraction algorithm of some sort, then collect
only the best candidate phrases. I bet a lot of these could be
fit into an FST, perhaps even indexed at every starting token's
position so that infix searches could work. If you need absolutely all
suggestions you'll need to come up with something more clever.
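
A rough sketch of the "indexed at every starting token's position" idea,
assuming the candidate phrases have already been extracted. A sorted
list of (token-suffix, phrase) pairs stands in for the FST, and
build_infix_index / infix_suggest are hypothetical names, not Lucene API.

```python
from bisect import bisect_left


def build_infix_index(phrases):
    """Index each phrase once per starting token, so a prefix lookup on
    the suffix behaves like an infix lookup on the whole phrase."""
    entries = []
    for phrase in phrases:
        tokens = phrase.lower().split()
        for i in range(len(tokens)):
            entries.append((" ".join(tokens[i:]), phrase))
    entries.sort()
    return entries


def infix_suggest(entries, typed, limit=5):
    """Return up to `limit` phrases with a token suffix starting with `typed`."""
    key = " ".join(typed.lower().split())
    if not key:
        return []
    # (key,) sorts before any (key..., phrase) pair with the same prefix.
    i = bisect_left(entries, (key,))
    results = []
    while i < len(entries) and entries[i][0].startswith(key):
        phrase = entries[i][1]
        if phrase not in results:
            results.append(phrase)
            if len(results) == limit:
                break
        i += 1
    return results


index = build_infix_index(["boston red sox rumor", "red hot chili peppers"])
print(infix_suggest(index, "red so"))
```

The cost is one entry per token per phrase, which is why this only looks
feasible for a pruned set of best candidate phrases and not for
absolutely all suggestions.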

Dawid
