zookeeper-user mailing list archives

From "121476721@qq.com" <121476...@qq.com>
Subject Re: Re: a misunderstanding of ZAB
Date Thu, 05 Sep 2019 05:36:48 GMT
Thank you, Michael. I think I get the idea now.
In case 2, when L1 fails before receiving ACKs from a quorum, the global state of p1 is neither
COMMITTED nor DROPPED.
It stays undecided until a new leader is elected and syncs with its followers: if the new leader has p1,
then p1 will be committed; if it does not have p1, then p1 will be dropped.
So for a client, if a write request takes too long, the client may receive a timeout exception,
and it must then query the servers again to find out whether the previous write succeeded or failed?
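
To check my understanding of the commit/drop rule, here is a small Python sketch. It is only a toy model, not ZooKeeper code: the names elect and fate_of are made up for illustration, a log is just a list of integer zxids, and the election simply picks the voter whose log ends with the highest zxid (real ZK also compares epochs and server ids).

```python
def elect(logs, voters):
    """Return the voter whose log ends with the highest zxid (ties: highest id)."""
    return max(voters, key=lambda s: (logs[s][-1] if logs[s] else 0, s))


def fate_of(zxid, logs, voters):
    """Outcome of proposal `zxid` once `voters` elect a leader and sync to it."""
    quorum = len(logs) // 2 + 1
    if len(voters) < quorum:
        raise ValueError("voters do not form a quorum")
    leader = elect(logs, voters)
    # Assumption of this toy model: followers end up with exactly the leader's log.
    return "COMMITTED" if zxid in logs[leader] else "DROPPED"


# zxid 1 stands for p1; F1 was the old leader L1 and has failed.
case1 = {"F1": [1], "F2": [1], "F3": [1], "F4": [], "F5": []}
case2 = {"F1": [1], "F2": [1], "F3": [], "F4": [], "F5": []}

print(fate_of(1, case1, ["F2", "F4", "F5"]))  # COMMITTED
print(fate_of(1, case2, ["F2", "F4", "F5"]))  # COMMITTED
print(fate_of(1, case2, ["F3", "F4", "F5"]))  # DROPPED
```

In case 1 every quorum of survivors contains F2 or F3, so p1 is always committed. In case 2 the result depends on whether F2 is part of the electing quorum.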



121476721@qq.com
 
From: Michael Han
Date: 2019-09-04 02:26
To: user
Subject: Re: a misunderstanding of ZAB
+1 with what Alex has said.
 
The commit case is easy to understand. For the skip case, consider this example:
 
old quorum: F1 F2 F3 F4 F5, with F1 as L1. p exists only on F1 and F2.
new quorum: F1 F2 F3 F4 F5, with F3 as L2. This is possible because, although
F1 and F2 have the latest zxid, they could be partitioned away, and F3 F4 F5 are
enough to form a quorum and elect a new leader.
 
Once the partition heals, p on F1 and F2 should be dropped (in
ZK, this is what the "TRUNC" sync is for).
 
>> L2 becomes the new leader; it should skip p1.
 
If your L2 is F2 here, p1 will not be skipped, since p1 is available on F2,
the new leader.
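
A rough Python sketch of the truncation idea in the example above. This is a simplified model under stated assumptions, not the actual ZooKeeper sync code: sync_follower is a hypothetical helper, logs are plain lists of integer zxids, and only the "TRUNC" and "DIFF" cases are modelled.

```python
def sync_follower(leader_log, follower_log):
    """Return (steps, new_follower_log) that make the follower match the leader."""
    # Length of the longest common prefix of the two logs.
    common = 0
    for a, b in zip(leader_log, follower_log):
        if a != b:
            break
        common += 1

    steps = []
    if len(follower_log) > common:
        # The follower holds proposals the leader never saw: drop them.
        steps.append(("TRUNC", follower_log[common:]))
    if len(leader_log) > common:
        # The follower is missing proposals the leader has: send them.
        steps.append(("DIFF", leader_log[common:]))
    return steps, list(leader_log)


# F3 became leader without p (zxid 3); F1 rejoins with p still in its log.
steps, f1_log = sync_follower(leader_log=[1, 2], follower_log=[1, 2, 3])
print(steps)   # [('TRUNC', [3])]
print(f1_log)  # [1, 2]
```

If L2 were F2 instead, the leader's log would contain p, nothing would be truncated, and the lagging followers would receive p through the "DIFF" step.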
 
On Tue, Sep 3, 2019 at 10:35 AM Alexander Shraer <shralex@gmail.com> wrote:
 
> In case2, it is possible that p1 is committed or dropped. It depends on
> whether L2 knows about p1.
> Note that L2 needs the support of a quorum to become leader, and in ZK
> since there is no state copy from followers to leader, the leader candidate
> needs to have the longest log.
> So, if L2's log includes p1, it will be committed; otherwise it will be
> dropped.
>
> In case 1, L2's log necessarily includes p1, since p1 is present at a quorum,
> and without p1 in its log it's not possible to have a log more
> up-to-date than that of a quorum / get the support of a quorum to become
> leader.
>
> Alex
>
>
> On Tue, Sep 3, 2019 at 4:52 AM Norbert Kalmar <nkalmar@cloudera.com.invalid> wrote:
>
> > Hi,
> >
> > That's a good question. So if I understand correctly, you are asking what
> > the "last seen zxid" is when there is a new Leader Election in ZooKeeper.
> > I checked the ZAB protocol, and it is not entirely clear to me either,
> > but my understanding is that the last seen zxid is the last
> > transaction, which is read from the txnlogs in case of a recovery. Honestly,
> > there's nothing else this could be read from. So even if it hasn't been
> > committed to the datatree (which exists in memory anyway, at least until
> > a snapshot is taken), it is still the last txn logged by one of the
> > followers, so that follower will win the Leader Election, and the other
> > followers will get this txn as well.
> > Anyone agree/disagree? :)
> >
> > Regards,
> > Norbert
> >
> > On Mon, Sep 2, 2019 at 4:50 AM 121476721@qq.com <121476721@qq.com> wrote:
> >
> > > Hi, I'm new to ZooKeeper, and this problem has confused me for nearly two
> > > months...
> > > Papers tell me that ZAB must satisfy:
> > > A message delivered by one server must be delivered on a quorum.
> > > A message skipped must always be skipped.
> > > Then consider the two cases below. L is short for leader, F is short for
> > > follower, p is short for proposal.
> > > Case 1:
> > > L1 sends p1 to F2 F3 F4 F5.
> > > F2 and F3 ack p1, reaching a quorum.
> > > L1 is about to send the commit but fails...
> > > L2 becomes the new leader; it should commit p1.
> > >
> > > Case 2:
> > > L1 sends p1 to F2 F3 F4 F5.
> > > Only F2 acks p1, not reaching a quorum.
> > > Then L1 fails...
> > > L2 becomes the new leader; it should skip p1.
> > >
> > > I think L2 should handle these cases in the election & recovery phase. But how
> > > can L2 know the global state and decide whether to commit p1 or skip p1?
> > > If anyone can help, I would much appreciate it.
> > >
> > >
> > >
> > > 121476721@qq.com
> > >
> >
>