lucene-solr-user mailing list archives

From Jaroslaw Rozanski ...@jarekrozanski.com>
Subject Re: Separating Search and Indexing in SolrCloud
Date Fri, 16 Dec 2016 21:05:27 GMT
Thanks,


On 16/12/16 20:56, Shawn Heisey wrote:
> On 12/16/2016 5:43 AM, Jaroslaw Rozanski wrote:
>> The leader is responsible for distributing update requests to the
>> replicas, so eventually all replicas have the same state as the
>> leader. Not a problem. It is more about the performance of doing so.
>> If I gather correctly, normal replication happens by standard update
>> request, not by, say, segment copy.
> 
> For SolrCloud, yes.  The master/slave replication that existed before
> SolrCloud does work by copying segment files, but SolrCloud does not
> work that way.  The old master/slave replication feature IS used by
> SolrCloud, but ONLY for index recovery -- copying the entire index from
> the leader to another replica in the event that the replica gets so far
> behind that it cannot be brought current by regular updates and/or the
> transaction log.  This is also used to make new replicas.
> 
>> Hence, if my understanding is correct, sending search requests to
>> replicas only, in an index-heavy environment, would bring no benefit.
> 
> Correct, it would have no benefit.  There's something else: when you
> send queries to SolrCloud, they do not necessarily stay on the node
> where you sent them.  By default, multiple query requests are load
> balanced across the cloud, so they'll hit the leader anyway, even if you
> never send them to the leader.

With a custom Solr client, the above logic no longer applies to my case.
I can easily control which replica/core in a shard my query is directed
to (along with distrib=false).
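
To illustrate what directing a query at one core looks like, here is a minimal sketch in Python that only builds the request URL. The host name and core name are hypothetical placeholders, and the helper function is made up for this example; only the distrib=false parameter comes from the thread. Note that with distrib=false the query is answered from that one core alone, so in a multi-shard collection the results cover only that shard.

```python
from urllib.parse import urlencode


def direct_core_query_url(base_url, core, query, rows=10):
    """Build a select URL aimed at one specific core.

    distrib=false asks Solr to answer from this core only instead of
    fanning the request out across the cloud.
    """
    params = urlencode({"q": query, "rows": rows, "distrib": "false"})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


# Hypothetical node and core names, for illustration only.
url = direct_core_query_url(
    "http://solr-node2:8983/solr",
    "collection1_shard1_replica2",
    "title:solr",
)
print(url)
```

A custom client would pick the base URL and core per request; how it chooses them (for example from cluster state) is outside this sketch.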

>> So the question is: is there a mechanism in SolrCloud (not the legacy
>> master/slave set-up) to make one node take the load of indexing while
>> other nodes focus on searching?
> 
> Indexing will always be done by all replicas, including the leader.
> 
> Something to mention, although it doesn't accomplish what you're after: 
> There is a preferLocalShards parameter that you can send with your query
> to keep SolrCloud from doing its load balancing *if* the query can be
> satisfied from local indexes.  This parameter should only be used in one
> of the following situations:
> 
> * Your query rate is very low.
> * You are already load balancing the requests yourself.
> 
> If the preferLocalShards parameter is used in other situations, it can
> end up concentrating a large number of requests onto some replicas and
> leaving the other replicas idle.
> 
> https://cwiki.apache.org/confluence/display/solr/Distributed+Requests#DistributedRequests-PreferLocalShards
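
As a rough sketch of the second situation above (load balancing the requests yourself), the Python snippet below rotates queries across nodes round-robin and adds preferLocalShards=true to each one. The node URLs, the collection name, and the helper function are hypothetical; only the parameter name comes from the thread. The snippet builds URLs and does not send any requests.

```python
from itertools import cycle
from urllib.parse import urlencode


def prefer_local_urls(nodes, collection, query):
    """Yield query URLs endlessly, rotating across nodes round-robin.

    Each URL carries preferLocalShards=true, so the receiving node is
    asked to answer from its local indexes when it can.
    """
    params = urlencode({"q": query, "preferLocalShards": "true"})
    for node in cycle(nodes):
        yield f"{node.rstrip('/')}/{collection}/select?{params}"


# Hypothetical node addresses and collection name.
nodes = ["http://solr-node1:8983/solr", "http://solr-node2:8983/solr"]
urls = prefer_local_urls(nodes, "collection1", "title:solr")
first_three = [next(urls) for _ in range(3)]
for u in first_three:
    print(u)
```

Because the client spreads requests evenly itself, no single replica ends up with all the traffic, which is the risk described above when the parameter is used without external load balancing.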


Yep, already solved. I am more concerned that the memory requirements
of indexing at volume will affect the performance of search requests
and/or cluster stability.

> Thanks,
> Shawn
> 



-- 
Jaroslaw Rozanski | e: me@jarekrozanski.com
695E 436F A176 4961 7793  5C70 AFDF FB5E 682C 4D3D

