zookeeper-dev mailing list archives

From "Benjamin Reed (JIRA)" <j...@apache.org>
Subject [jira] Updated: (ZOOKEEPER-962) leader/follower coherence issue when follower is receiving a DIFF
Date Sat, 22 Jan 2011 01:02:43 GMT

     [ https://issues.apache.org/jira/browse/ZOOKEEPER-962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Reed updated ZOOKEEPER-962:

+1 running final tests, and will then commit.

> leader/follower coherence issue when follower is receiving a DIFF
> -----------------------------------------------------------------
>                 Key: ZOOKEEPER-962
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-962
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.3.2
>            Reporter: Camille Fournier
>            Assignee: ChiaHung Lin
>            Priority: Critical
>             Fix For: 3.3.3, 3.4.0
>         Attachments: ZOOKEEPER-962.patch, ZOOKEEPER-962_2.patch, ZOOKEEPER-962_3.patch,
>                      ZOOKEEPER-962_4.patch, ZOOKEEPER-962_5.patch, ZOOKEEPER-962_6.patch
> From mailing list:
> It seems like we rely on the LearnerHandler thread startup to capture all of the missing
> transactions in the SNAP or DIFF, but I don't see anything (especially in the DIFF case)
> that prevents us from committing more transactions before we actually start forwarding
> to the new follower.
> Let me explain using my example from ZOOKEEPER-919. Assume we have quorum already, so the
> leader can be processing transactions while my follower is starting up.
> I'm a follower at zxid N-5, the leader is at N. I send my FOLLOWERINFO packet to the leader
> with that information. The leader gets the proposals from its committed log (time T1), and
> syncs on the proposal list (LearnerHandler line 267. Why? It's a copy of the underlying
> list... this might be part of our problem). I check to see if the peerLastZxid is within the
> max and min committed log and it is, so I'm going to send a diff. I set the zxidToSend to
> be the maxCommittedLog at time T3 (we already know this is sketchy), and forward the proposals
> from my copied proposal list starting at the peerLastZxid+1 up to the last proposal transaction
> (as seen at time T1).
> After I have queued up all those diffs to send, I tell the leader to startForwarding updates
> to this follower (line 308).
> So, let's say that at time T2 I actually swap out the leader to the thread that is handling
> the various request processors, and see that I got enough votes to commit zxid N+1. I commit
> N+1 and so my maxCommittedLog at T3 is N+1, but this proposal is not in the list of proposals
> that I got back at time T1, so I don't forward this diff to the follower. Additionally, I processed
> the commit and removed it from my leader's toBeApplied list. So when I call startForwarding
> for this new follower, I don't see this transaction as a transaction to be forwarded.

> That's one problem. Let's also imagine, however, that I commit N+1 at time T4. The maxCommittedLog
> value is consistent with the max of the diff packets I am going to send the follower, but
> I still committed N+1 and removed it from the toBeApplied list before calling startForwarding
> with this follower. How does the follower get this transaction? Does it?
> To put it another way, here is the thread interaction, hopefully formatted so you can
> read it...
>             LearnerHandlerThread             RequestProcessorThread
> T1(LH):     get list of proposals (COPY)
> T2(RPT):                                     commit N+1, remove from toBeApplied
> T3(LH):     get maxCommittedLog
> T4(LH):     send diffs from view at T1
> T5(LH):     startForwarding
> Or
>             LearnerHandlerThread             RequestProcessorThread
> T1(LH):     get list of proposals (COPY)
> T2(LH):     get maxCommittedLog
> T3(RPT):                                     commit N+1, remove from toBeApplied
> T4(LH):     send diffs from view at T1
> T5(LH):     startForwarding
> I'm trying to figure out what, if anything, keeps requests from being committed
> yet never seen by the follower before it fully starts up.
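
The interleaving described in the quoted report can be replayed deterministically with a
toy model. The sketch below is only an illustration under stated assumptions: ToyLeader,
propose, commit, and missedZxids are hypothetical names, not ZooKeeper's real classes or
the contents of the attached patches. It uses N = 5, a follower at zxid 0 (N-5), and a
pending proposal 6 (N+1), with the steps labeled as in the first table.

```java
import java.util.ArrayList;
import java.util.List;

class DiffRaceSketch {
    // Hypothetical, heavily simplified stand-in for the leader's state.
    static class ToyLeader {
        final List<Long> committedLog = new ArrayList<>(); // committed proposals
        final List<Long> toBeApplied = new ArrayList<>();  // proposed, not yet committed

        void propose(long zxid) { toBeApplied.add(zxid); }

        // Committing moves the proposal out of toBeApplied and into the committed log.
        void commit(long zxid) {
            toBeApplied.remove(Long.valueOf(zxid));
            committedLog.add(zxid);
        }
    }

    // Returns the committed zxids the new follower never receives.
    // commitDuringSync == true replays the T1..T5 interleaving from the report.
    static List<Long> missedZxids(boolean commitDuringSync) {
        ToyLeader leader = new ToyLeader();
        for (long z = 1; z <= 5; z++) { leader.propose(z); leader.commit(z); }
        leader.propose(6);     // N+1 is outstanding when the follower connects
        long peerLastZxid = 0; // follower is at N-5

        List<Long> received = new ArrayList<>();
        List<Long> copy = new ArrayList<>(leader.committedLog); // T1: copy of proposals
        if (commitDuringSync) leader.commit(6);                 // T2: other thread commits N+1
        for (long zxid : copy) {                                // T4: send diffs from view at T1
            if (zxid > peerLastZxid) received.add(zxid);
        }
        received.addAll(leader.toBeApplied);                    // T5: startForwarding
        if (!commitDuringSync) leader.commit(6);                // commit only after sync is done

        List<Long> missed = new ArrayList<>(leader.committedLog);
        missed.removeAll(received);
        return missed;
    }

    public static void main(String[] args) {
        System.out.println("commit during sync: missed " + missedZxids(true));  // [6]
        System.out.println("commit after sync:  missed " + missedZxids(false)); // []
    }
}
```

In this model the gap disappears only when no commit can land between the copy at T1 and
startForwarding at T5, which suggests (as an assumption, not a statement about the
committed fix) that those two steps need to be covered by the same lock that commits take.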

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
