lucene-solr-user mailing list archives

From Rohit Jain <rohit.j...@esgyn.com>
Subject Parallel API interface into SOLR
Date Mon, 12 Jun 2017 16:24:00 GMT
Hi folks,

We have a solution in which we would like to connect to SOLR via an API, submit a query, and
pre-process the results before returning them to our users.  However, in a large distributed
cluster deployment, the result set returned by SOLR can be very large.  In these cases, we would
like to set up parallel streams, so that each parallel SOLR worker feeds directly into one of our
processes distributed across the cluster.  That way, we can pre-process the results in parallel
before we consolidate (and potentially reduce / aggregate) them further for the user, who has a
single client connection to our solution.  It is a sort of MapReduce scenario in which our
processes are the reducers.  We could consume the results as returned by the SOLR worker
processes, or perhaps have them shuffled based on a shard key before our processes receive them.
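
To make the scenario concrete, here is a rough sketch of the shuffle-then-reduce flow described
above.  It does not call SOLR at all: the documents are plain dictionaries standing in for query
results, and the field names ("region", "amount"), the function names, and the per-key sum are
all hypothetical placeholders for whatever the real pre-processing would be.

```python
import zlib
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition_for(shard_key: str, num_reducers: int) -> int:
    """Map a shard key to a reducer index with a stable hash (CRC32)."""
    return zlib.crc32(shard_key.encode("utf-8")) % num_reducers


def shuffle(docs, key_field, num_reducers):
    """Group documents so that all documents sharing a key reach the same reducer."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[partition_for(str(doc[key_field]), num_reducers)].append(doc)
    return buckets


def pre_process(docs):
    """Hypothetical reducer step: sum the 'amount' field per 'region'."""
    totals = defaultdict(int)
    for doc in docs:
        totals[doc["region"]] += doc["amount"]
    return dict(totals)


def run(docs, num_reducers=3):
    """Shuffle by key, pre-process each bucket in parallel, then consolidate."""
    buckets = shuffle(docs, "region", num_reducers)
    with ThreadPoolExecutor(max_workers=num_reducers) as pool:
        partials = list(pool.map(pre_process, buckets.values()))
    merged = {}
    for partial in partials:
        # Keys never overlap across buckets because the shuffle is keyed on "region".
        merged.update(partial)
    return merged
```

For example, run([{"region": "east", "amount": 1}, {"region": "west", "amount": 2},
{"region": "east", "amount": 3}]) consolidates to one total per region.  In the real deployment
each bucket would be fed by a SOLR worker over the network rather than built in memory, and the
threads here only stand in for our processes distributed across the cluster.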

Any ideas on how this could be done?

Rohit Jain
