zookeeper-user mailing list archives

From "Ibrahim El-sanosi (PGR)" <i.s.el-san...@newcastle.ac.uk>
Subject RE: 3-server Zab cluster
Date Mon, 05 Oct 2015 17:13:33 GMT
Hi Rakesh,

In Zab, before the end of the synchronization phase, the new leader will not commit any proposals
in its transaction log that have not received a majority of ACKs from the previous ensemble (that is what
you are saying).

I think what Zab does is this: before the end of the synchronization phase, within L and F2 (the new
quorum), L (the prospective leader) syncs its own state with F2 as the initial state. Referring
to my scenario, zxid=10 is part of that initial state and, as a result, it will be delivered
in the new quorum (L and F2) before any new proposals of the new epoch are processed.

You can read this thread http://zookeeper-user.578899.n2.nabble.com/Zab-Failure-scenario-td7581583.html
for more info

What do you think? Does anyone have any questions or concerns about such a (small) optimization?

Ibrahim
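The reading of the synchronization phase described above can be sketched as a toy model. This is only an illustration of that reading, not ZooKeeper's actual code: the `synchronize` helper, the plain-integer zxids, and the hypothetical zxid 11 for the first proposal of the new epoch are all assumptions made for the example.

```python
def synchronize(leader_log, follower_log):
    """Toy sync: copy to the follower every zxid the leader has logged
    that the follower lacks, then return the shared initial history."""
    missing = [zxid for zxid in sorted(leader_log) if zxid not in follower_log]
    follower_log.extend(missing)
    return sorted(follower_log)


# Assumed state after the crash: L logged zxid=10, F2 never received it.
leader_log = [8, 9, 10]
f2_log = [8, 9]

# Under this reading, zxid=10 becomes part of the initial history of the
# new quorum (L and F2) ...
initial_history = synchronize(leader_log, f2_log)

# ... and is delivered before any proposal of the new epoch
# (11 is a hypothetical zxid used only for illustration).
new_epoch_proposals = [11]
delivery_order = initial_history + new_epoch_proposals
print(delivery_order)
```

Whether a real ZooKeeper leader resends such a pending proposal during sync is exactly the open question in this thread; the sketch only shows what would follow if it does.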

From: Rakesh Radhakrishnan [mailto:rakeshr.apache@gmail.com]
Sent: Thursday, October 01, 2015 06:15 PM
To: Ibrahim El-sanosi (PGR)
Subject: Re: 3-server Zab cluster

> (***) Ok, I thought when F2 forms a quorum with L and before
> serving clients, L synchronizes its state with F2, resulting in zxid=10 being committed
> in L and F2 as well. I also thought this process is the same as Zab, isn't it?

Since L didn't receive any ACK responses from F1 or F2 before leaving the Leader status previously,
L won't commit transaction zxid=10. IIUC, after re-forming the new quorum L will not have any
mechanism to re-initiate the proposal (active messaging phase) for the previous zxid=10.

-Rakesh

On Thu, Oct 1, 2015 at 10:19 PM, Ibrahim El-sanosi (PGR) <i.s.el-sanosi@newcastle.ac.uk>
wrote:
Thank you Rakesh.

>>> In your case, zk client sees a successful response from F1. Then assume F2 joins
>>> quorum first and L becomes the leader again. But the newly formed quorum will not
>>> have the zxid=10 transaction. This will make the cluster inconsistent, isn't it?

(***) Ok, I thought when F2 forms a quorum with L and before serving clients, L synchronizes
its state with F2, resulting in zxid=10 being committed in L and F2 as well. I also thought
this process is the same as Zab, isn't it?


>>> Apart from the above case I'm not seeing any other problems with a 3-node cluster.
>>> The above data loss case can be avoided by adding the assumption that more than
>>> the tolerated number of server failures may affect cluster consistency and result
>>> in data loss.

Yes, if the solution above (***) is not correct, your assumption makes sense.

Ibrahim

From: Rakesh Radhakrishnan [mailto:rakeshr.apache@gmail.com]
Sent: 01 October 2015 17:26
To: user@zookeeper.apache.org; Ibrahim El-sanosi (PGR)
Subject: Re: 3-server Zab cluster

Hi Ibrahim,

The example below is taken from your older mail thread.

>>>>> 1. Leader (L) sends a proposal p with zxid=10 to F1 and F2.
>>>>> 2. F1 logs, sends an ACK, commits, replies to clients and crashes. F2
>>>>> crashes before receiving P10. L has not received any ACKs.

My thoughts on the above scenario are:

In your case, zk client sees a successful response from F1. Then assume F2 joins quorum first
and L becomes the leader again. But the newly formed quorum will not have the zxid=10 transaction.
This will make the cluster inconsistent, isn't it?

Apart from the above case I'm not seeing any other problems with a 3-node cluster. The above
data loss case can be avoided by adding the assumption that more than the tolerated number of
server failures may affect cluster consistency and result in data loss. But I feel this
optimization would have more such cases if we scale the cluster beyond 3 servers. For now,
I'm not thinking in that direction as your case is limited to a 3-node cluster.

Regards,
Rakesh
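The failure scenario above can be replayed in a small toy model. This is a sketch under stated assumptions, not ZooKeeper code: the `Server` class, the `committed_after_recovery` helper, and the `resend_pending` flag are hypothetical names, and the flag simply switches between the two readings discussed in this thread (the re-elected leader does not, or does, push its pending zxid=10 during sync).

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    log: set = field(default_factory=set)        # zxids written to the log
    committed: set = field(default_factory=set)  # zxids visible to clients
    up: bool = True


L, F1, F2 = Server("L"), Server("F1"), Server("F2")

# Step 1: L logs zxid=10 and proposes it to F1 and F2.
L.log.add(10)

# Step 2: F1 logs, ACKs, commits locally (the proposed optimization),
# replies to the client and crashes. F2 crashes before receiving the
# proposal, and F1's ACK never reaches L.
F1.log.add(10)
F1.committed.add(10)
client_saw_commit = 10 in F1.committed
F1.up = False
F2.up = False

# Recovery: F2 restarts and forms a quorum with L; F1 stays down.
F2.up = True


def committed_after_recovery(leader, resend_pending):
    """zxids committed in the new quorum once sync has finished."""
    if resend_pending:
        # Reading 1: the leader's logged-but-uncommitted entries are synced.
        return leader.committed | leader.log
    # Reading 2: only entries the leader had already committed survive.
    return set(leader.committed)


print(client_saw_commit)                      # the client was told "committed"
print(10 in committed_after_recovery(L, resend_pending=False))
print(10 in committed_after_recovery(L, resend_pending=True))
```

Under the second reading the client has observed a commit that the new quorum does not contain, which is the inconsistency described above.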


On Tue, Sep 29, 2015 at 2:28 PM, Ibrahim El-sanosi (PGR) <i.s.el-sanosi@newcastle.ac.uk>
wrote:
Yes Alex, in my post I mentioned that this (small) optimization can only work with a 3-server
cluster.

Could anyone confirm whether the optimization can work?

Ibrahim

-----Original Message-----
From: Alexander Shraer [mailto:shralex@gmail.com]
Sent: Tuesday, September 29, 2015 12:11 AM
To: user@zookeeper.apache.org
Subject: Re: 3-server Zab cluster

I'm not 100% sure whether operations that were pending on the leader are sent out during sync
when this leader loses quorum and is re-elected. If so, then maybe you're right. But in any
case, this would not work for 5 or more servers...
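The remark about 5 or more servers comes down to quorum arithmetic, which a short sketch makes concrete. The helper names `majority` and `one_ack_is_quorum` are made up for this example; the only assumption is that a quorum is a strict majority of the ensemble and that the leader counts itself.

```python
def majority(n_servers: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return n_servers // 2 + 1


def one_ack_is_quorum(n_servers: int) -> bool:
    """True if the leader plus a single acknowledging follower is a majority."""
    leader_plus_one_follower = 2
    return leader_plus_one_follower >= majority(n_servers)


for n in (3, 5, 7):
    print(n, majority(n), one_ack_is_quorum(n))
```

With 3 servers a single follower ACK already completes a majority, so a follower could in principle treat its own ACK as the deciding one. With 5 or more servers it cannot know whether enough other followers have acknowledged.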

On Mon, Sep 28, 2015 at 3:51 PM, Ibrahim El-sanosi (PGR) <i.s.el-sanosi@newcastle.ac.uk>
wrote:

> Thank you Alex for replying.
>
> When you said "the leader gets re-elected and the operation is
> truncated from logs at other servers", I thought the new leader would
> sync its logs with the other followers (synchronization phase),
> resulting in the operation being committed by the new quorum.  Let me lay out the scenario as steps:
>
> 1. Leader (L) sends a proposal p with zxid=10 to F1 and F2.
> 2. F1 logs, sends an ACK, commits, replies to clients and crashes. F2
> crashes before receiving P10. L has not received any ACKs.
>
> Possible solution  (1)
> The leader will move to the LOOKING phase as there is no quorum supporting
> its leadership. Now assume F2 wakes up. F2 forms a quorum with L
> (the previous leader); L becomes the new leader again as it has the latest zxid (10) in its log.
> L syncs its state with F2; as a result L, F1 (before crashing) and F2
> commit P10.  Is that correct?
>
> Possible solution  (2)
> The leader will move to the LOOKING phase as there is no quorum supporting
> its leadership. Now assume F1 (with zxid=10 committed) wakes up. I
> am not sure who should be the leader (F1 with zxid=10 committed or L,
> the previous leader, with zxid=10 logged). I think F1 becomes the new
> leader as it has zxid=10 committed. F1 forms a quorum with L (the previous
> leader) and becomes the new leader as it has the latest zxid (10). F1 (the new
> leader) syncs its state with L (the previous leader, now a
> follower); as a result zxid=10 is committed by the new quorum.  Is that correct?
>
> What do you think?
>
> Ibrahim
>
>
>
>
>
> -----Original Message-----
> From: Alexander Shraer [mailto:shralex@gmail.com]
> Sent: Monday, September 28, 2015 07:27 PM
> To: user@zookeeper.apache.org
> Cc: dev@zookeeper.apache.org
> Subject: Re: 3-server Zab cluster
>
> Committing locally when sending an ACK at a server would lead to loss
> of consistency - it is possible that this is the only server that
> acks, e.g., this server is temporarily disconnected from the leader,
> the leader gets re-elected and the operation is truncated from logs at
> other servers. It's OK to ACK it but it's not OK to commit, since this
> exposes it to users as a committed operation that they can see.
>
> On Mon, Sep 28, 2015 at 4:19 AM, Ibrahim El-sanosi (PGR)
> <i.s.el-sanosi@newcastle.ac.uk> wrote:
>
> > In Zab, assume we have a cluster consisting of 3 servers. To deliver a
> > write request, it must run 3 communication steps: proposal,
> > acknowledgement and commit.
> > As Zab uses reliable FIFO channels, it is possible to remove the commit round. As
> > soon as a follower receives a proposal, it logs it, sends an ACK and
> > commits locally. Upon receiving an ACK from any follower, the leader
> > commits the proposal locally; no COMMIT message needs to be sent to
> > followers. In this case, all servers commit a proposal in two
> > communication steps, reducing latency, particularly at followers.
> >
> > Note that this optimization can only work in a 3-server cluster
> > (a follower reaches a majority as soon as it acks).
> > Does anyone see any problems with such a (small) optimization?
> > Ibrahim
> >
>
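The original proposal at the bottom of this thread can be summarised by counting one-way message delays until each role may commit. This is a rough sketch only: the `MESSAGE_ORDER` list, the `schemes` table and the `delays_until` helper are illustrative names, and the model ignores disk logging time and assumes a 3-server ensemble where the first follower ACK completes a majority.

```python
# Messages exchanged for one write, in the order they are sent.
MESSAGE_ORDER = ["PROPOSAL", "ACK", "COMMIT"]


def delays_until(message: str) -> int:
    """One-way message delays elapsed when `message` has been received."""
    return MESSAGE_ORDER.index(message) + 1


# Which received message lets each role commit under each scheme.
schemes = {
    "standard Zab": {"follower": "COMMIT", "leader": "ACK"},
    "proposed":     {"follower": "PROPOSAL", "leader": "ACK"},
}

for name, triggers in schemes.items():
    follower = delays_until(triggers["follower"])
    leader = delays_until(triggers["leader"])
    print(f"{name}: follower commits after {follower}, leader after {leader}")
```

The saving is entirely on the follower side, and it is also where the risk discussed in the replies arises: the follower commits before anyone knows that its ACK reached the leader.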

