lucene-solr-user mailing list archives

From <>
Subject Re: Load balancing with solr cloud
Date Fri, 21 Oct 2016 09:07:18 GMT
As I understand it, for non-SolrCloud-aware clients you have to load balance your searches manually;
see ymonad's answer here:

This is from 2014, so it may have changed since then. I would be interested to know as well.
Also, for indexing, I think it's possible to control how many replicas need to confirm to
the leader before the response is sent to the client, as you can with, say, MongoDB replicas.
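
For illustration, manual load balancing of searches could look roughly like the sketch below. It only rotates through a fixed list of nodes and builds the query URL; it sends nothing over the network. The host names, the collection name and the helper next_select_url are hypothetical and not taken from this thread.

```python
import itertools
from urllib.parse import urlencode

# Hypothetical node addresses and collection name; replace with your own.
SOLR_NODES = [
    "http://solr1:8983/solr",
    "http://solr2:8983/solr",
    "http://solr3:8983/solr",
]
COLLECTION = "mycollection"

# Endless iterator that yields the nodes in order, then starts over.
_node_cycle = itertools.cycle(SOLR_NODES)


def next_select_url(query, rows=10):
    """Build a /select URL on the next node in round-robin order."""
    base = next(_node_cycle)
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base}/{COLLECTION}/select?{params}"


if __name__ == "__main__":
    # Each call targets the next node in the list.
    for _ in range(4):
        print(next_select_url("title:solr"))
```

A real client would also need to skip nodes that are down, which this sketch does not attempt.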


    On Friday, October 21, 2016 1:18 AM, Garth Grimm <>

 No matter where you send the update to initially, it will get sent to the leader of the shard
first.  The leader parses it to ensure it can be indexed, then sends it
to all the replicas in parallel.  The replicas will do their parsing and report back that
they have persisted the data to their tlogs.  Once the leader hears back from all the replicas,
the leader will reply that the update is complete, and your client will receive its
HTTP response on the transaction.

At least that's the general case flow.
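
As a rough illustration of that flow, here is a toy in-memory model. It is not Solr code: the Node and Shard classes, the "id" validation check and the returned dictionary are all made up for this sketch, and real replication involves far more than appending to a list.

```python
from concurrent.futures import ThreadPoolExecutor


class Node:
    """Toy stand-in for a Solr node with a transaction log."""

    def __init__(self, name):
        self.name = name
        self.tlog = []

    def persist(self, doc):
        # Pretend to parse the document and write it to the tlog.
        self.tlog.append(doc)
        return True


class Shard:
    """One shard: a leader plus zero or more replicas."""

    def __init__(self, leader, replicas):
        self.leader = leader
        self.replicas = replicas

    def update(self, doc, received_by):
        # Any node may receive the update; non-leaders forward it (one extra hop).
        forwarded = received_by is not self.leader
        # The leader checks that the document can be indexed (hypothetical rule).
        if not isinstance(doc, dict) or "id" not in doc:
            raise ValueError("document needs an 'id' field")
        self.leader.persist(doc)
        # The leader sends the document to all replicas in parallel
        # and waits until every one of them has acknowledged it.
        workers = max(1, len(self.replicas))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            acks = list(pool.map(lambda r: r.persist(doc), self.replicas))
        return {"ok": all(acks), "forwarded": forwarded}


if __name__ == "__main__":
    leader, r1, r2 = Node("leader"), Node("replica1"), Node("replica2")
    shard = Shard(leader, [r1, r2])
    print(shard.update({"id": "1"}, received_by=r1))
```

Sending the update straight to the leader gives the same result with "forwarded" set to False, which corresponds to skipping one hop.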

So it really won't matter how your load balancing is handled above the cloud.  All the work
is done the same way, with the leader having to do slightly more work than the replicas.

If you can manage to initially send all the updates to the correct leader, you can skip one
hop before the work starts, which may buy you a small performance boost compared to randomly
picking a node to send the request to.  But you'll need to be taxing the cloud pretty heavily
before that difference becomes noticeable.

-----Original Message-----
From: Sadheera Vithanage [] 
Sent: Thursday, October 20, 2016 5:55 PM
Subject: Re: Load balancing with solr cloud

Thank you very much John and Garth,

I've tested it out and it works fine; I can send the updates to any of the solr nodes.

If I am not using a zookeeper aware client and I always direct all my queries (read queries)
to the leader of the solr instances, does it automatically load balance between the replicas?

Or do I have to hit each instance in a round robin way and have the load balanced through
the code?

Please advise on the best way to do so.

Thank you very much again.

On Fri, Oct 21, 2016 at 9:18 AM, Garth Grimm <>

> Actually, zookeeper really won't participate in the update process at all.
> If you're using a "zookeeper aware" client like SolrJ, the SolrJ 
> library will read the cloud configuration from zookeeper, but will 
> send all the updates to the leader of the shard that the document is meant to go to.
> If you're not using a "zookeeper aware" client, you can send the 
> update to any of the solr nodes, and they will evaluate the cloud 
> configuration information they've already received from zookeeper, and 
> then forward the document to the leader of the shard that will handle the document update.
> In general, Zookeeper really only provides the cloud configuration 
> information once (at most) during all the updates; the actual document 
> update only gets sent to solr nodes.  There's definitely no need to 
> distribute load between zookeepers for this situation.
> Regards,
> Garth Grimm
> -----Original Message-----
> From: Sadheera Vithanage []
> Sent: Thursday, October 20, 2016 5:11 PM
> To:
> Subject: Load balancing with solr cloud
> Hi again Experts,
> I have a question related to load balancing in solr cloud.
> If we have 3 zookeeper nodes and 3 solr instances (1 leader, 2 
> secondary replicas and 1 shard), when the traffic comes in, the primary 
> zookeeper server will be hammered, correct?
> I understand (or is it wrong?) that zookeeper will load balance between 
> solr nodes, but if we want to distribute the load between zookeeper 
> nodes as well, what is the best approach?
> Cost is a concern for us too.
> Thank you very much, in advance.
> --
> Regards
> Sadheera Vithanage


Sadheera Vithanage
