zookeeper-user mailing list archives

From Alexander Shraer <shra...@yahoo-inc.com>
Subject RE: what would happen with this case ? (ZAB protocol question)
Date Thu, 21 Jul 2011 20:11:20 GMT
I think you're right - there is a bug here. 

As I mentioned, when a server starts up it locally commits all ops it has ever received (see
ZKDatabase.loadDataBase). More importantly, the same happens in the Leader.lead() method
(zk.loadData()). So when execution reaches the code you quoted, maxCommittedLog reflects all
transactions this leader has seen before becoming a leader, and everything works. In your
scenario everyone sees the same set of transactions, so there is no problem.
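
A minimal sketch of the branch logic in the quoted LearnerHandler.run() snippet, reduced to a pure function so the effect of maxCommittedLog is easy to see. The class SyncSketch, the Sync enum, the STATE_TRANSFER label and the chooseSync helper are hypothetical names for illustration, not ZooKeeper APIs; the zxids 1 and 2 stand for P1 and P2.

```java
public class SyncSketch {
    /** Hypothetical labels; the quoted code uses Leader.DIFF and Leader.TRUNC. */
    enum Sync { DIFF, TRUNC, STATE_TRANSFER }

    /** Mirrors the if/else structure of the quoted snippet, nothing more. */
    static Sync chooseSync(boolean committedLogEmpty, long minCommittedLog,
                           long maxCommittedLog, long peerLastZxid) {
        if (committedLogEmpty) {
            return Sync.STATE_TRANSFER; // "just let the state transfer happen"
        }
        if (maxCommittedLog >= peerLastZxid && minCommittedLog <= peerLastZxid) {
            return Sync.DIFF;           // follower is inside the committed window
        }
        if (peerLastZxid > maxCommittedLog) {
            return Sync.TRUNC;          // follower is ahead of the leader's commits
        }
        return Sync.STATE_TRANSFER;     // follower is older than the window
    }

    public static void main(String[] args) {
        // Leader reloaded its log, so P1 and P2 both count as committed.
        System.out.println(chooseSync(false, 1, 2, 2));
        // Same follower, but a leader whose committed log stops at P1.
        System.out.println(chooseSync(false, 1, 1, 2));
    }
}
```

With the first set of inputs the follower gets a DIFF and keeps P2; with the second it is told to truncate back to P1. The only difference is whether the leader's maxCommittedLog already covers everything it has seen.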

The problem is in leader election: if the server doesn't reboot before running leader election
(the usual case), then only the transactions for which it received a commit count toward its
vote, so it might not be elected leader even if it has seen more transactions than the others.
This may lead to transactions being dropped.
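
To make the failure mode concrete, here is a rough sketch, not ZooKeeper code. The Peer class, its fields and the elect helper are hypothetical, and the election is simplified to "highest advertised zxid wins, ties go to the higher server id". It assumes a three-server ensemble where the old leader A logged and committed P1 (zxid 1) after B acked it, then died before B received the COMMIT; C never saw P1.

```java
import java.util.Arrays;
import java.util.List;

public class ElectionSketch {
    /** Hypothetical model of a server taking part in the election. */
    static final class Peer {
        final long id;
        final long lastLoggedZxid;    // highest zxid written to the local log
        final long lastCommittedZxid; // highest zxid for which a COMMIT arrived

        Peer(long id, long lastLoggedZxid, long lastCommittedZxid) {
            this.id = id;
            this.lastLoggedZxid = lastLoggedZxid;
            this.lastCommittedZxid = lastCommittedZxid;
        }
    }

    /** The zxid a peer puts in its vote under the chosen policy. */
    static long advertised(Peer p, boolean voteWithLoggedZxid) {
        return voteWithLoggedZxid ? p.lastLoggedZxid : p.lastCommittedZxid;
    }

    /** Simplified election: highest advertised zxid, then highest id. */
    static Peer elect(List<Peer> voters, boolean voteWithLoggedZxid) {
        Peer best = null;
        for (Peer p : voters) {
            if (best == null) {
                best = p;
                continue;
            }
            long zxid = advertised(p, voteWithLoggedZxid);
            long bestZxid = advertised(best, voteWithLoggedZxid);
            if (zxid > bestZxid || (zxid == bestZxid && p.id > best.id)) {
                best = p;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Peer b = new Peer(2, 1, 0); // logged P1, never saw its COMMIT
        Peer c = new Peer(3, 0, 0); // never saw P1 at all
        List<Peer> voters = Arrays.asList(b, c);
        System.out.println("committed-zxid vote elects server " + elect(voters, false).id);
        System.out.println("logged-zxid vote elects server " + elect(voters, true).id);
    }
}
```

Voting with the committed zxid, B and C tie at 0 and C wins on server id, so the new leader has never seen P1 and B is later truncated to match it. Voting with the logged zxid elects B, which still holds P1.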

I opened a JIRA for this.

Thanks,
Alex

> -----Original Message-----
> From: Yang [mailto:teddyyyy123@gmail.com]
> Sent: Thursday, July 21, 2011 11:12 AM
> To: Alexander Shraer
> Subject: Re: what would happen with this case ? (ZAB protocol question)
> 
> "Any operation that was truly committed (acked by majority), will be
> known to one of the servers participating in the leader election"
> ------ this is where I'm having difficulty: in the example I gave, the
> commit on the dead leader is "Known/seen" by surviving nodes, but the
> code snippet I showed seems to suggest that only seen COMMITTED txns
> are replayed from new leader, not the seen transactions.
> 
> 
> thanks
> Yang
> 
> 
> 
> On Thu, Jul 21, 2011 at 11:04 AM, Alexander Shraer
> <shralex@yahoo-inc.com> wrote:
> > Hi,
> >
> > If I understand it correctly, when a server starts up it locally commits
> > all ops it has ever received (see ZKDatabase.loadDataBase). Leader election
> > then chooses the node that has the most ops committed to be the leader. It
> > is possible that a minority of servers are down during leader election, but
> > a majority (or quorum) do participate in leader election. Any operation
> > that was truly committed (acked by majority) will be known to one of the
> > servers participating in the leader election, so the elected leader will at
> > least know all truly committed ops. If a server wakes up later and connects
> > to this leader, its log is truncated to match the leader's. But this is
> > safe to do, because as explained above none of the truncated ops could have
> > been previously acked by a quorum.
> >
> > Alex
> >
> >
> >
> >> -----Original Message-----
> >> From: Yang [mailto:teddyyyy123@gmail.com]
> >> Sent: Wednesday, July 20, 2011 12:29 AM
> >> To: user@zookeeper.apache.org
> >> Subject: Re: what would happen with this case ? (ZAB protocol question)
> >>
> >> I found that my question is basically the same as
> >>
> >> http://zookeeper-user.578899.n2.nabble.com/Q-about-ZK-internal-how-commit-is-being-remembered-td4464847.html
> >>
> >> but reading that thread still leaves me unclear as to my original
> >> question.
> >>
> >> The following snippet from LearnerHandler.run() seems to be what the
> >> newly elected leader is doing: basically bringing every follower up to
> >> its max committed proposal and discarding the rest.
> >> ---- If this is a correct understanding, then the P1 commit in my
> >> original question seems to be lost?
> >>
> >> Thanks
> >> Yang
> >>
> >>
> >>
> >>                 final long maxCommittedLog = leader.zk.getZKDatabase().getmaxCommittedLog();
> >>                 final long minCommittedLog = leader.zk.getZKDatabase().getminCommittedLog();
> >>                 LinkedList<Proposal> proposals = leader.zk.getZKDatabase().getCommittedLog();
> >>                 if (proposals.size() != 0) {
> >>                     if ((maxCommittedLog >= peerLastZxid)
> >>                             && (minCommittedLog <= peerLastZxid)) {
> >>                         packetToSend = Leader.DIFF;
> >>                         zxidToSend = maxCommittedLog;
> >>                         for (Proposal propose: proposals) {
> >>                             if (propose.packet.getZxid() > peerLastZxid) {
> >>                                 queuePacket(propose.packet);
> >>                                 QuorumPacket qcommit = new QuorumPacket(Leader.COMMIT, propose.packet.getZxid(),
> >>                                         null, null);
> >>                                 queuePacket(qcommit);
> >>                             }
> >>                         }
> >>                     } else if (peerLastZxid > maxCommittedLog) {
> >>                         packetToSend = Leader.TRUNC;
> >>                         zxidToSend = maxCommittedLog;
> >>                         updates = zxidToSend;
> >>                     }
> >>                 } else {
> >>                     // just let the state transfer happen
> >>                 }
> >>
> >> On Tue, Jul 19, 2011 at 2:44 PM, Yang <teddyyyy123@gmail.com> wrote:
> >> > like the first figure in the ZAB paper described,
> >> > say we have node A B C, A is leader now
> >> >
> >> > all 3 nodes see proposals P1, P2, and all acked both,
> >> > A sees acks for P1, and commits it, but right after this A dies.
> >> >
> >> > now B is elected, B does not see any commit, so (according to my
> >> > possibly wrong understanding from the code)
> >> > B throws away P1 P2, and starts a new epoch.
> >> > is this the current behavior of code?
> >> >
> >> > but then the commit of P1 on A is lost?
> >> >
> >> > Thanks
> >> > Yang
> >> >
> >
