lucene-solr-user mailing list archives

From Dave Troiano
Subject CommonsHttpSolrServer and dynamic custom results filtering
Date Mon, 31 Jan 2011 19:47:17 GMT

I'm implementing custom dynamic results filtering to improve fuzzy /
phonetic search support in my search application.  I use the
CommonsHttpSolrServer object to connect remotely to Solr.  I would like to
be able to index multiple fuzzy / phonetic match encodings, e.g. one of the
packaged phonetic encodings, my own phonetic encoding, my own or a packaged
q-gram encoding that will capture string overlap, etc., and then be able to
filter out the results I consider "false positives" in a dynamic, custom
way.  The general approaches I've seen for this are:

1. Use Solr's fuzzy queries.  I haven't been able to achieve acceptable
performance using fuzzy queries, and they also lack the dynamic flexibility
described above.  For example, whether or not I filter a phonetic match from
the results may depend on many things (whether there were exact matches on
relevant entities, who the user is, etc.), and I can't achieve this
flexibility with a fuzzy field query.

2. Create an RMI-based client/server setup so that I can use the
SolrIndexSearcher to pass in a custom Collector (as in Ch. 9 of Lucene in
Action, but with a custom Collector added).  A custom Collector seems like
exactly what I want, but I don't see a way to achieve this using any of the
packaged SolrServer implementations that support a remote setup like this.
I also worry about the stability of the remote object framework, since it
has been moved over to contrib and it seems that there may be serialization
issues or other instability.

3. Continue to use the CommonsHttpSolrServer object for querying my index,
but add in post-processing to dynamically filter results.  This seems doable
but unnatural and potentially inefficient given that I need to worry about
supporting pagination and facet counts in such a framework.
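
To make the pagination concern in approach 3 concrete, here is a rough
sketch of what the post-processing might look like.  It is only an
illustration under stated assumptions: PostFilterPager, fetch, keep and
batchSize are hypothetical names, not part of SolrJ.  In a real setup, fetch
would presumably wrap a CommonsHttpSolrServer query with the start and rows
parameters set; here it is stubbed with an in-memory list so the sketch runs
on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;
import java.util.function.Predicate;

// Hypothetical helper (not part of SolrJ): builds one page of results
// after a client-side filter has removed unwanted hits.
class PostFilterPager {

    /**
     * fetch: (start, rows) -> raw hits in rank order; a batch shorter than
     *        the requested rows is taken to mean the raw results are exhausted.
     * keep:  the dynamic filter; returns false for "false positives".
     */
    static <T> List<T> page(BiFunction<Integer, Integer, List<T>> fetch,
                            Predicate<T> keep,
                            int pageIndex, int pageSize, int batchSize) {
        if (pageIndex < 0 || pageSize <= 0 || batchSize <= 0) {
            throw new IllegalArgumentException("invalid paging arguments");
        }
        List<T> page = new ArrayList<>();
        int toSkip = pageIndex * pageSize; // kept hits that belong to earlier pages
        int rawStart = 0;                  // offset into the unfiltered results
        while (page.size() < pageSize) {
            List<T> batch = fetch.apply(rawStart, batchSize);
            for (T hit : batch) {
                if (!keep.test(hit)) continue;          // filtered out
                if (toSkip > 0) { toSkip--; continue; } // on an earlier page
                page.add(hit);
                if (page.size() == pageSize) break;
            }
            if (batch.size() < batchSize) break;        // raw results exhausted
            rawStart += batchSize;
        }
        return page;
    }

    public static void main(String[] args) {
        // In-memory stand-in for the index: raw hits 0..24 in rank order.
        List<Integer> raw = new ArrayList<>();
        for (int i = 0; i < 25; i++) raw.add(i);
        BiFunction<Integer, Integer, List<Integer>> fetch = (start, rows) ->
            raw.subList(Math.min(start, raw.size()),
                        Math.min(start + rows, raw.size()));
        // Stand-in filter: treat multiples of 3 as false positives.
        Predicate<Integer> keep = n -> n % 3 != 0;
        System.out.println(page(fetch, keep, 1, 5, 4)); // [8, 10, 11, 13, 14]
    }
}
```

The sketch shows where the inefficiency comes from: because the filter
changes offsets, every page request re-scans the raw results from the
beginning, and any facet counts returned by the server would still reflect
the unfiltered result set.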

Is there an easier way to do custom dynamic results filtering (like via a
custom Collector) while still using CommonsHttpSolrServer?  Do people have
any other suggestions or insights about the approaches summarized above?

