lucene-dev mailing list archives

From mike anderson <>
Subject Fwd: distributed search on duplicate shards
Date Fri, 24 Sep 2010 04:43:22 GMT
Just wanted to poke this since it got buried under a dozen or so Jira
updates. I also sent it to the deprecated list, though I think it should
have forwarded.


---------- Forwarded message ----------
From: mike anderson <>
Date: Thu, Sep 23, 2010 at 7:06 PM
Subject: distributed search on duplicate shards

Hi all,

My company is currently running a distributed Solr cluster with about 15
shards. We occasionally find that one shard will be relatively slow and thus
hold up the entire response. To remedy this we thought it might be useful to
have a system such that:

1. We can duplicate each shard, and thus have "sets" of shards, each with
the same index
2. We can pass in these sets of shards along with the query (for instance,
if "!" is the delimiter, shards=solr1a!solr1b,solr2a!solr2b)
3. The request goes out to /all/ shards (unlike load balancing in Solr)
4. The first shard from a set (solr1a, solr1b) to successfully return is
honored, and the other requests (solr1b, if solr1a responds first, for
instance) are removed/ignored
5. The response is completed and returned as soon as one shard from each set
has responded
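
To make the parameter format in point 2 concrete, here is a minimal sketch of how such a value could be split into shard sets. It is not taken from the patch; the class and method names (ShardSetParser, parseShardSets) are hypothetical, and it assumes "," separates sets and "!" separates the duplicates within a set, as in the example above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Solr: splits a shards value such as
// "solr1a!solr1b,solr2a!solr2b" into a list of shard sets.
class ShardSetParser {
    static List<List<String>> parseShardSets(String shardsParam) {
        List<List<String>> sets = new ArrayList<>();
        // "," separates the sets; "!" separates duplicates inside one set.
        for (String set : shardsParam.split(",")) {
            sets.add(Arrays.asList(set.split("!")));
        }
        return sets;
    }

    public static void main(String[] args) {
        // prints [[solr1a, solr1b], [solr2a, solr2b]]
        System.out.println(parseShardSets("solr1a!solr1b,solr2a!solr2b"));
    }
}
```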

I've written a patch to accomplish this, but have a few questions:

1. What are the known disadvantages to such a strategy? (we've thought of a
few, like sets being out of sync, but they don't bother us too much)
2. What would this type of feature be called? Knowing that, I can open a
Jira ticket for it.
3. Is there a preferred way to do this? My current patch (which I can post
soon) works in the HTTPClient portion of SearchHandler. I keep a hash map of
the shard sets and, when each response comes back, cancel the remaining
Future<ShardResponse> objects in the corresponding set.
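
For anyone who wants to picture the idea without the patch, below is a rough standalone sketch of "first successful responder per set wins, cancel the rest" using only java.util.concurrent. It is not the patch and does not touch SearchHandler; ShardCall, FirstResponderPerSet and search are hypothetical names, and the shard requests are simulated with sleeps.

```java
import java.util.*;
import java.util.concurrent.*;

class FirstResponderPerSet {
    /** Hypothetical stand-in for a shard request; not Solr's API. */
    interface ShardCall { String query(String shard) throws Exception; }

    /** Returns, per set index, the first successful response from that set. */
    static Map<Integer, String> search(List<List<String>> shardSets, ShardCall call,
                                       ExecutorService pool) throws InterruptedException {
        CompletionService<String> completion = new ExecutorCompletionService<>(pool);
        Map<Future<String>, Integer> setOf = new HashMap<>();          // future -> set index
        Map<Integer, List<Future<String>>> pending = new HashMap<>();  // set index -> futures
        int total = 0;
        for (int i = 0; i < shardSets.size(); i++) {
            List<Future<String>> futures = new ArrayList<>();
            for (String shard : shardSets.get(i)) {
                Future<String> f = completion.submit(() -> call.query(shard));
                setOf.put(f, i);
                futures.add(f);
                total++;
            }
            pending.put(i, futures);
        }
        Map<Integer, String> winners = new HashMap<>();
        int seen = 0;
        // Stop as soon as every set has a winner, or every request has finished.
        while (seen < total && winners.size() < shardSets.size()) {
            Future<String> done = completion.take();
            seen++;
            int set = setOf.get(done);
            if (done.isCancelled() || winners.containsKey(set)) continue;
            try {
                winners.put(set, done.get());
            } catch (ExecutionException e) {
                continue; // this duplicate failed; wait for another one in its set
            }
            for (Future<String> other : pending.get(set)) {
                if (other != done) other.cancel(true); // drop the slower duplicates
            }
        }
        return winners;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<List<String>> sets = Arrays.asList(
                Arrays.asList("solr1a", "solr1b"), Arrays.asList("solr2a", "solr2b"));
        // Simulated latencies: solr1a and solr2b are the slow duplicates.
        ShardCall call = shard -> {
            boolean slow = shard.equals("solr1a") || shard.equals("solr2b");
            Thread.sleep(slow ? 500 : 10);
            return "response from " + shard;
        };
        System.out.println(search(sets, call, pool));
        pool.shutdownNow();
    }
}
```

With the simulated latencies in main, the fast duplicates (solr1b and solr2a) should win and the slow ones get cancelled. A set whose duplicates all fail simply has no entry in the result, which a real implementation would need to report as an error.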

Thanks in advance,

P.S. I'd like to write a test for this feature but it wasn't clear from the
distributed test how to do so. Could somebody point me in the right
direction (an existing test, perhaps) for how to accomplish this?
