lucene-solr-user mailing list archives

From Jeff Wartes <>
Subject Re: Replicas for same shard not in sync
Date Wed, 27 Apr 2016 19:20:10 GMT
I didn’t leave it out, I was asking what it was. I’ve been reading around some more this
morning though, and here’s what I’ve come up with, feel free to correct.

Continuing my scenario:

If you did NOT specify min_rf
5. leader sets leader_initiated_recovery in ZK for the replica with the failure. Hopefully
that replica notices and re-syncs at some point, because it can’t become a leader until
it does. (SOLR-5495, SOLR-8034)
6. leader returns success to the client

If you specified a min_rf and it WAS achieved:

5. leader sets leader_initiated_recovery in ZK for the replica with the failure.

6. leader returns success (and the achieved rf) to the client (SOLR-5468, SOLR-8062)

If you specified a min_rf and it WASN'T achieved:
5. leader does NOT set leader_initiated_recovery (SOLR-8034)
6. leader returns success (and the achieved rf) to the client (SOLR-5468, SOLR-8062)

I couldn’t seem to find anyplace that’d cause an error return to the client, aside from
race conditions around who the leader should be, or if the update couldn’t be applied to
the leader itself.
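If it helps, the three cases above condense into a small decision table. This is just a Python sketch of the behavior as I've pieced it together (the function and field names are mine, not Solr's actual code):

```python
def leader_response(total_replicas, acked_replicas, min_rf=None):
    """Hypothetical model of what the leader does after distributing an
    update, per SOLR-5468 / SOLR-8034 as discussed above. The achieved
    rf counts the leader itself plus every replica that acked."""
    achieved_rf = 1 + acked_replicas
    any_failure = acked_replicas < total_replicas

    if min_rf is None:
        # No min_rf: failed replicas go into leader-initiated recovery,
        # and the client still sees success.
        return {"set_lir": any_failure, "status": "success", "rf": None}

    if achieved_rf >= min_rf:
        # min_rf achieved: same as above, but the achieved rf is reported.
        return {"set_lir": any_failure, "status": "success", "rf": achieved_rf}

    # min_rf NOT achieved: leader skips leader-initiated recovery
    # (SOLR-8034) but still reports success plus the achieved rf;
    # deciding whether to re-send is left to the client.
    return {"set_lir": False, "status": "success", "rf": achieved_rf}
```

Note that in all three cases the status is "success"; the only error paths I found are leader-election races or a failure on the leader itself.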

On 4/26/16, 8:22 PM, "Erick Erickson" <> wrote:

>You left out step 5... leader responds with fail for the update to the
>client. At this point, the client is in charge of retrying the docs.
>Retrying will update all the docs that were successfully indexed in
>the failed packet, but that's not unusual.
>There's no real rollback semantics that I know of. This is analogous
>to not hitting minRF, see:
>In particular the bit about "it is the client's responsibility to
>re-send it"...
>There's some retry logic in the code that distributes the updates from
>the leader as well.
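Erick's point that the client is in charge of retrying boils down to a loop like the following sketch (the `send_update` callable is a hypothetical stand-in for whatever client call you use, not a SolrJ API; when min_rf is set, Solr reports the achieved rf in the response):

```python
import time

def send_with_retry(send_update, docs, min_rf, max_attempts=3, backoff_s=1.0):
    """Re-send a batch until the achieved replication factor meets min_rf.
    Re-sending already-indexed docs is safe because updates are idempotent
    by uniqueKey."""
    for attempt in range(max_attempts):
        achieved = send_update(docs)  # hypothetical client call; returns achieved rf
        if achieved >= min_rf:
            return achieved
        time.sleep(backoff_s * (2 ** attempt))  # simple exponential backoff
    raise RuntimeError(f"min_rf {min_rf} not achieved after {max_attempts} attempts")
```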
>On Tue, Apr 26, 2016 at 12:51 PM, Jeff Wartes <> wrote:
>> At the risk of thread hijacking, this is an area where I don’t know that I fully understand,
>> so I want to make sure.
>> I understand the case where a node is marked “down” in the clusterstate, but
>> what if it’s down for less than the ZK heartbeat? That’s not unreasonable, I’ve seen
>> some recommendations for really high ZK timeouts. Let’s assume there’s some big GC pause,
>> or some other ephemeral service interruption that recovers very quickly.
>> So,
>> 1. leader gets an update request
>> 2. leader makes update requests to all live nodes
>> 3. leader gets success responses from all but one replica
>> 4. leader gets failure response from one replica
>> At this point we have different replicas with different data sets. Does anything
>> signal that the failure-response node has now diverged? Does the leader attempt to roll back
>> the other replicas? I’ve seen references to leader-initiated-recovery, is this that?
>> And regardless, is the update request considered a success (and reported as such
>> to the client) by the leader?
>> On 4/25/16, 12:14 PM, "Erick Erickson" <> wrote:
>>>Yes, deleting and re-adding the replica will be fine.
>>>Having commits happen from the client when you _also_ have
>>>autocommits that frequently (10 seconds and 1 second are pretty
>>>aggressive BTW) is usually not recommended or necessary.
>>>bq: if one or more replicas are down, updates presented to the leader
>>>still succeed, right?  If so, tedsolr is correct that the Solr client
>>>app needs to re-issue update....
>>>Absolutely not the case. When the replicas are down, they're marked as
>>>down by Zookeeper. When they come back up they find the leader through
>>>Zookeeper magic and ask, essentially "Did I miss any updates?" If the
>>>replica did miss any updates it gets them from the leader either
>>>through the leader replaying the updates from its transaction log to
>>>the replica or by replicating the entire index from the leader. Which
>>>path is followed is a function of how far behind the replica is.
>>>In this latter case, any updates that come in to the leader while the
>>>replication is happening are buffered and replayed on top of the index
>>>when the full replication finishes.
>>>The net-net here is that you should not have to track whether updates
>>>got to all the replicas or not. One of the major advantages of
>>>SolrCloud is to remove that worry from the indexing client...
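The peer-sync-versus-full-replication choice Erick describes can be sketched roughly like this (the 100-update window matches Solr's default transaction log retention, but treat the names and structure as illustrative, not Solr's real recovery code):

```python
PEER_SYNC_WINDOW = 100  # Solr keeps roughly the last 100 updates in the tlog by default

def recovery_path(missed_updates):
    """Illustrative sketch of how a returning replica catches up: replay
    from the leader's transaction log if it missed only a few updates,
    otherwise copy the whole index (new updates arriving during the copy
    are buffered and replayed on top afterwards)."""
    if missed_updates <= PEER_SYNC_WINDOW:
        return "peersync: replay missed updates from the leader's tlog"
    return "full recovery: replicate the entire index, then replay buffered updates"
```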
>>>On Mon, Apr 25, 2016 at 11:39 AM, David Smith
>>><> wrote:
>>>> Erick,
>>>> So that my understanding is correct, let me ask, if one or more replicas
>>>> are down, updates presented to the leader still succeed, right?  If so, tedsolr is correct
>>>> that the Solr client app needs to re-issue updates, if it wants stronger guarantees on replica
>>>> consistency than what Solr provides.
>>>> The “Write Fault Tolerance” section of the Solr Wiki makes what I believe
>>>> is the same point:
>>>> "On the client side, if the achieved replication factor is less than the
>>>> acceptable level, then the client application can take additional measures to handle the degraded
>>>> state. For instance, a client application may want to keep a log of which update requests
>>>> were sent while the state of the collection was degraded and then resend the updates once
>>>> the problem has been resolved."
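The wiki's keep-a-log-and-resend suggestion might look something like this on the client side (an entirely hypothetical helper, not part of any Solr client library):

```python
class DegradedUpdateLog:
    """Sketch of the wiki's advice: remember updates sent while the
    achieved rf was below the acceptable level, and resend them once the
    collection is healthy again."""

    def __init__(self, acceptable_rf):
        self.acceptable_rf = acceptable_rf
        self.pending = []

    def record(self, docs, achieved_rf):
        # Only batches sent while degraded need to be remembered.
        if achieved_rf < self.acceptable_rf:
            self.pending.append(docs)

    def resend_all(self, send_update):
        # Replay every logged batch; safe because re-sending the same
        # docs is idempotent by uniqueKey.
        while self.pending:
            send_update(self.pending.pop(0))
```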
>>>> Kind Regards,
>>>> David
>>>> On 4/25/16, 11:57 AM, "Erick Erickson" <> wrote:
>>>>>bq: I also read that it's up to the
>>>>>client to keep track of updates in case commits don't happen on all the replicas.
>>>>>This is not true. Or if it is it's a bug.
>>>>>The update cycle is this:
>>>>>1> updates get to the leader
>>>>>2> updates are sent to all followers and indexed on the leader as well
>>>>>3> each replica writes the updates to the local transaction log
>>>>>4> all the replicas ack back to the leader
>>>>>5> the leader responds to the client.
>>>>>At this point, all the replicas for the shard have the docs locally
>>>>>and can take over as leader.
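Those five steps amount to a synchronous fan-out. A toy model, with in-memory lists standing in for transaction logs (nothing like the real implementation):

```python
def handle_update(doc, leader_log, replica_logs):
    """Toy model of the update cycle above: the leader indexes locally,
    forwards to every replica, waits for all acks, then answers the
    client. Each list stands in for a node's transaction log."""
    leader_log.append(doc)            # steps 1-2: leader indexes the doc
    acks = []
    for log in replica_logs:          # steps 2-3: forward; replica writes its tlog
        log.append(doc)
        acks.append(True)             # step 4: replica acks back to the leader
    return "success" if all(acks) else "failure"  # step 5: respond to client
```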
>>>>>You may be confusing indexing in batches and having errors with
>>>>>updates getting to replicas. When you send a batch of docs to Solr,
>>>>>if one of them fails indexing some of the rest of the docs may not
>>>>>be indexed. See SOLR-445 for some work on this front.
>>>>>That said, bouncing servers willy-nilly during heavy indexing, especially
>>>>>if the indexer doesn't know enough to retry if an indexing attempt fails, could
>>>>>be the root cause here. Have you verified that your indexing program
>>>>>retries in the event of failure?
>>>>>On Mon, Apr 25, 2016 at 6:13 AM, tedsolr <> wrote:
>>>>>> I've done a bit of reading - found some other posts with similar issues.
>>>>>> So I gather "Optimizing" a collection is rarely a good idea. It doesn't
>>>>>> need to be condensed to a single segment. I also read that it's up to the
>>>>>> client to keep track of updates in case commits don't happen on all
>>>>>> replicas. Solr will commit and return success as long as one replica receives
>>>>>> the update.
>>>>>> I have a state where the two replicas for one collection are out of sync.
>>>>>> One has some updates that the other does not. And I don't have log data to
>>>>>> tell me what the differences are. This happened during a maintenance window,
>>>>>> when the servers got restarted while a large index job was running. Usually
>>>>>> this doesn't cause a problem, but it did last Thursday.
>>>>>> What I plan to do is select the replica I believe is incomplete and delete
>>>>>> it. Then add a new one. I was just hoping Solr had a solution for this -
>>>>>> maybe using the ZK transaction logs to replay some updates, or force a
>>>>>> resync between the replicas.
>>>>>> I will also implement a fix to prevent Solr from restarting unless one of
>>>>>> its config files has changed. No need to bounce Solr just for kicks.