zookeeper-user mailing list archives

From Yang <teddyyyy...@gmail.com>
Subject Re: question on ZAB protocol
Date Wed, 13 Jul 2011 02:59:39 GMT
I guess by asking what happens in the following scenarios, I'd
understand it better:

let's say we have nodes A B C D E, now A is the leader

A broadcasts <1,1> and it reaches only B; then A and B die, and C, D, E
elect a new leader. The new system is going to throw away <1,1> since
it does not know of its existence, right?

Start from scratch:
A broadcasts <1,1>, it reaches all, and all send an ACK to A, but A dies
before receiving the ACKs. Then B, C, D, E elect a new leader, which
sees <1,1> in its log, so it broadcasts <1,1> to the others, which all
commit it. Now if we look back, when A dies the client should get a
"write failure", but after the re-election among B, C, D, E the written
value does get into the system???
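
To make the two scenarios above concrete, here is a minimal Python sketch. It assumes, as a simplification and not as ZAB's actual recovery procedure, that the new leader simply adopts the longest log found among the surviving nodes. The name `longest_history` and the tuple encoding of <1,1> are made up for illustration.

```python
from typing import Dict, List, Set, Tuple

# A proposal id such as <1,1> is modelled as a plain tuple (assumption).
Zxid = Tuple[int, int]


def longest_history(logs: Dict[str, List[Zxid]], alive: Set[str]) -> List[Zxid]:
    """Return the longest log among the surviving nodes.

    Hypothetical recovery rule used only for this illustration.
    """
    return max((logs[n] for n in alive), key=len, default=[])


# Scenario 1: <1,1> reached only A and B, then both die.
logs1 = {"A": [(1, 1)], "B": [(1, 1)], "C": [], "D": [], "E": []}
after1 = longest_history(logs1, {"C", "D", "E"})

# Scenario 2: <1,1> reached every node, then only A dies.
logs2 = {n: [(1, 1)] for n in "ABCDE"}
after2 = longest_history(logs2, {"B", "C", "D", "E"})

print(after1)  # survivors never saw <1,1>
print(after2)  # survivors still hold <1,1>
```

Under this simplified rule, the first scenario drops <1,1> and the second keeps it, even though the client never saw a success reply in either case.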


I kind of see another possible reason for the quorum:

When the new leader is elected, it needs to find the longest
history among all followers and bring everybody up to date with that
longest history.
If every write reaches a quorum, then a quorum during election is
guaranteed to give the new leader the full history.
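
The reasoning here rests on any two quorums sharing at least one node. A small Python sketch, assuming majority quorums over the five nodes A to E from the scenarios above (the helper names are made up for illustration), checks this by brute force:

```python
from itertools import combinations

NODES = ["A", "B", "C", "D", "E"]


def quorums(nodes, size):
    """All subsets of `nodes` with exactly `size` members."""
    return [set(q) for q in combinations(nodes, size)]


def always_intersect(nodes, size):
    """True if every pair of size-`size` subsets shares at least one node."""
    qs = quorums(nodes, size)
    return all(q1 & q2 for q1 in qs for q2 in qs)


majority = len(NODES) // 2 + 1  # 3 out of 5

print(always_intersect(NODES, majority))  # write quorum always meets election quorum
print(always_intersect(NODES, 2))         # e.g. {A, B} and {C, D} are disjoint
```

With quorums of size 3, the set of nodes that acknowledged a write and the set taking part in an election always overlap; with size 2 they can be disjoint, so the election could miss the write.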

But against the above argument: if the old leader gets a quorum and
broadcasts a COMMIT to all, but the COMMIT reaches only 1 of them, and
then the quorum dies, and the follower that has the COMMIT has just served
the new data to a client and then dies too,
the election process is going to find a history missing the last
written record.
So some written records are going to be lost anyway.

On Tue, Jul 12, 2011 at 6:45 PM, Yang <teddyyyy123@gmail.com> wrote:
> I read the ZAB paper before, and never realized this question, but
> find out today that I can't answer why, so I'm bringing it up here.
> according to the paper
> B. Reed and F. P. Junqueira. A simple totally ordered broadcast
> protocol. In LADIS ’08: Proceedings of the 2nd Workshop
> on Large-Scale Distributed Systems and Middleware, pages 1–6,
> New York, NY, USA, 2008. ACM.
> the leader broadcasts a write to all replicas, and then waits for a
> quorum to reply, before sending out the COMMIT.
> why is the quorum necessary (i.e. why can't the leader just wait for
> one reply and start sending the COMMIT?)??
> now that I think about it, it seems that waiting for just one reply is
> enough, because the connections from the leader to the replicas are FIFO;
> as long as the replicas do not die,
> they will eventually get the writes, even though the writes arrive at
> them after the leader starts the COMMIT.
> the only reason I can think of for using a quorum is to tolerate more
> failures: if the only replica that replied dies, and the leader dies,
> then we lose that latest write.
> by requiring f ACKs, you can tolerate f-1 failures. but then you don't
> really need 2f+1 nodes in the ZK cluster, just f+1 is enough.
> Thanks a lot
> Yang
