zookeeper-user mailing list archives

From Michael Han <h...@apache.org>
Subject Re: Re: a misunderstanding of ZAB
Date Thu, 05 Sep 2019 22:23:18 GMT
>> if he has p1, then p1 will be committed

To be more precise, this holds only if the new leader can deliver p1 to a
quorum; otherwise we'll be back to case 1. This is basically the key point
Ben and Alex emphasized (deliver to quorum / quorum ack).
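
A minimal sketch of the commit and skip cases discussed in this thread, in
Python. It is a toy model, not ZooKeeper code: logs are plain lists, "most
up to date" is reduced to "longest log" with ties broken by name, and the
names sync, elect_and_sync, initial_logs and holders are made up for
illustration. TRUNC is the sync named in the thread; the DIFF and NONE
labels here are assumptions of the sketch.

```python
from typing import Dict, List, Set

Logs = Dict[str, List[str]]

def sync(logs: Logs, leader: str, follower: str) -> str:
    """Make the follower's log match the leader's; report which sync ran."""
    mine, theirs = logs[follower], logs[leader]
    if len(mine) > len(theirs):
        kind = "TRUNC"   # follower holds proposals the leader never saw
    elif len(mine) < len(theirs):
        kind = "DIFF"    # follower is missing proposals the leader has
    else:
        kind = "NONE"    # hypothetical label: nothing to do
    logs[follower] = list(theirs)
    return kind

def elect_and_sync(logs: Logs, reachable: Set[str]) -> str:
    """Elect the reachable server with the longest log, then sync the rest."""
    quorum = len(logs) // 2 + 1
    if len(reachable) < quorum:
        raise RuntimeError("no quorum reachable, so no leader can be elected")
    # Simplification: longest log wins, ties go to the lowest name.
    leader = max(sorted(reachable), key=lambda s: len(logs[s]))
    for server in sorted(reachable):
        if server != leader:
            sync(logs, leader, server)
    return leader

def initial_logs() -> Logs:
    # L1 (F1) logged p and only F2 received it before L1 failed.
    return {"F1": ["p"], "F2": ["p"], "F3": [], "F4": [], "F5": []}

def holders(logs: Logs, txn: str) -> int:
    """Number of servers whose log contains txn."""
    return sum(txn in log for log in logs.values())

# Skip case: F1 and F2 are partitioned away, F3 F4 F5 elect a leader.
skip = initial_logs()
l2 = elect_and_sync(skip, {"F3", "F4", "F5"})
healed = [sync(skip, l2, s) for s in ("F1", "F2")]  # partition heals
print(l2, healed, holders(skip, "p"))    # F3 ['TRUNC', 'TRUNC'] 0

# Commit case: F2 is part of the electing quorum, so it wins and p survives.
commit = initial_logs()
l2b = elect_and_sync(commit, {"F2", "F3", "F4"})
print(l2b, holders(commit, "p"))         # F2 4
```

In the second run p ends up on a quorum only because the new leader both
has p and manages to deliver it to the servers that elected it.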

On Thu, Sep 5, 2019 at 1:26 PM Alexander Shraer <shralex@gmail.com> wrote:

> > the global state is neither COMMITTED nor DROPPED.
>
> Just like in Paxos, if a quorum ACKs (it does not matter whether L1
> received the acks before crashing) then it's guaranteed not to be lost.
> If fewer than a quorum acked, then it's unknown until recovery happens, in
> which case it could be committed or dropped, depending on what L2 knows.
> The global state isn't known to any process though.
>
> > so for a client, if write query takes too much time, the client may
> > receive Timeout Exception, and it must query servers again to know
> > whether previous write is SUCCESS or FAIL?
>
> Yes. ZK doesn't currently have a good way of finding this out; see
> https://issues.apache.org/jira/browse/ZOOKEEPER-22
>
>
> Alex
>
>
>
> On Wed, Sep 4, 2019 at 10:37 PM 121476721@qq.com <121476721@qq.com> wrote:
>
> > thank you, Michael. seems i got the idea.
> > in case2, when L1 fails before receiving a quorum's ACKs, the global
> > state is neither COMMITTED nor DROPPED.
> > Until a new leader is elected and syncs with his followers: if he has p1,
> > then p1 will be committed; if he does not have p1, then p1 will be dropped.
> > so for a client, if write query takes too much time, the client may
> > receive Timeout Exception, and it must query servers again to know
> > whether previous write is SUCCESS or FAIL?
> >
> >
> >
> > 121476721@qq.com
> >
> > From: Michael Han
> > Date: 2019-09-04 02:26
> > To: user
> > Subject: Re: a misunderstanding of ZAB
> > +1 with what Alex has said.
> >
> > The commit case is easy to understand. For the skip case, consider this
> > example:
> >
> > old quorum: F1 F2 F3 F4 F5, with F1 as L1. L1 has p on F1 and F2.
> > new quorum: F1 F2 F3 F4 F5, with F3 as L2. This is possible because,
> > although F1 and F2 have the latest zxid, they could be partitioned away,
> > and F3 F4 F5 are enough to form a quorum and elect a new leader.
> >
> > Once the partition heals, the p logged on F1 and F2 should be dropped (in
> > ZK, this is what the "TRUNC" sync is for).
> >
> > >> L2 become new leader, he should skip p1.
> >
> > If your L2 is F2 here, p1 will not be skipped, since p1 is available on
> > F2, the new leader.
> >
> > On Tue, Sep 3, 2019 at 10:35 AM Alexander Shraer <shralex@gmail.com>
> > wrote:
> >
> > > In case2, it is possible that p1 is committed or dropped. It depends on
> > > whether L2 knows about p1.
> > > Note that L2 needs the support of a quorum to become leader, and in ZK,
> > > since there is no state copy from followers to leader, the leader
> > > candidate needs to have the longest log.
> > > So, if L2's log includes p1 it will be committed; otherwise it will be
> > > dropped.
> > >
> > > In case1, L2's log necessarily includes p1, since p1 is present at a
> > > quorum, and without having it in the log it's not possible to have a log
> > > more up-to-date than that of a quorum / get the support of a quorum to
> > > become leader.
> > >
> > > Alex
> > >
> > >
> > > On Tue, Sep 3, 2019 at 4:52 AM Norbert Kalmar
> > > <nkalmar@cloudera.com.invalid> wrote:
> > >
> > > > Hi,
> > > >
> > > > That's a good question. So if I understand correctly, you are asking
> > > > what happens if there is a new Leader Election in ZooKeeper, and what
> > > > the "last seen zxid" is. I checked the ZAB protocol; it is not entirely
> > > > clear to me either, but my understanding is that the last seen zxid
> > > > is the last transaction, which is read from txnlogs in case of a
> > > > recovery. Honestly, there's nothing else this could be read from. So if
> > > > it hasn't been committed to the datatree (and that exists in memory
> > > > anyway, at least until a snapshot is taken), it is still the last txn
> > > > that is logged by one of the followers, so he will win the Leader
> > > > Election, and the followers will get this txn as well.
> > > > Anyone agree/disagree? :)
> > > >
> > > > Regards,
> > > > Norbert
> > > >
> > > > On Mon, Sep 2, 2019 at 4:50 AM 121476721@qq.com <121476721@qq.com>
> > > > wrote:
> > > >
> > > > > hi, i'm new to zookeeper, and this problem has confused me for
> > > > > nearly two months...
> > > > > papers tell me that zab must satisfy:
> > > > > A message delivered by one server must be delivered on quorum.
> > > > > A message skipped must always be skipped.
> > > > > Then consider two cases below, L is short for leader, F is short
> > > > > for follower, p is short for proposal.
> > > > > Case1:
> > > > > L1 send p1 to F2 F3 F4 F5.
> > > > > F2 F3 ack p1, reach a quorum.
> > > > > L1 is about to send commit but failed...
> > > > > L2 become new leader, he should commit.
> > > > >
> > > > > Case2:
> > > > > L1 send p1 to F2 F3 F4 F5.
> > > > > Only F2 ack p1, not reach a quorum.
> > > > > Then L1 failed...
> > > > > L2 become new leader, he should skip p1.
> > > > >
> > > > > i think L2 should handle the cases in the election & recovery phase,
> > > > > but how can L2 know the global state and decide whether to commit p1
> > > > > or skip p1?
> > > > > if anyone can help, i will much appreciate it.
> > > > >
> > > > >
> > > > >
> > > > > 121476721@qq.com
> > > > >
> > > >
> > >
> >
>
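
On the client-side question in the thread (a write that times out may or
may not have been applied), one possible pattern is to tag each write with a
unique token and read the value back after a timeout. The sketch below is an
illustration only: FlakyStore is a hypothetical in-memory stand-in, not the
ZooKeeper client API, and write_and_confirm is a made-up helper name.

```python
import uuid

class FlakyStore:
    """Hypothetical stand-in for an ensemble whose reply to a write is lost."""

    def __init__(self, applies_before_timeout: bool):
        self.data = {}
        self.applies_before_timeout = applies_before_timeout

    def write(self, path: str, value: str) -> None:
        # The write may or may not take effect, but the client never
        # hears back either way.
        if self.applies_before_timeout:
            self.data[path] = value
        raise TimeoutError("no reply from server")

    def read(self, path: str):
        return self.data.get(path)

def write_and_confirm(store: FlakyStore, path: str, payload: str) -> bool:
    """Return True if the write is known to have been applied."""
    token = uuid.uuid4().hex            # makes this write distinguishable
    value = f"{token}:{payload}"
    try:
        store.write(path, value)
        return True
    except TimeoutError:
        # Outcome unknown at this point: read back and compare.
        return store.read(path) == value

print(write_and_confirm(FlakyStore(True), "/k", "v"))    # True
print(write_and_confirm(FlakyStore(False), "/k", "v"))   # False
```

Against a real ensemble, a read issued before recovery has finished could
still miss a write that is later committed, so a negative answer would need
to be rechecked rather than trusted immediately.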
