zookeeper-user mailing list archives

From cheetah <xuw...@gmail.com>
Subject Re: How zab avoid split-brain problem?
Date Tue, 30 Aug 2011 22:36:06 GMT
Hi Alex,

Thanks for the explanation.

Then I have another question:

If there are 7 machines in my current ZooKeeper cluster and two of them have
failed, how can I reconfigure ZooKeeper so that it works with the remaining 5
machines? That is, if the leader can get replies from 3 machines, it can commit
the transaction.

On the other hand, if I add 2 machines to make a 9-node ZooKeeper cluster, how
can I configure it to take advantage of all 9 machines?
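As far as I know, in the ZooKeeper releases available at the time of this thread
(dynamic reconfiguration, tracked as ZOOKEEPER-107, was still in progress),
changing the ensemble membership means editing the server list in zoo.cfg on
every node and performing a rolling restart. A sketch of what the edited file
might look like after dropping the two failed servers; the server IDs and
hostnames below are placeholders, not taken from this thread:

```
# zoo.cfg: 5-node ensemble after removing the two failed servers
# (hostnames and IDs are illustrative only)
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
server.4=zk4.example.com:2888:3888
server.5=zk5.example.com:2888:3888
```

Growing to 9 nodes works the same way in reverse: add the new `server.N` lines
on every machine and restart the servers one at a time.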

This is more relevant to the user mailing list, so I am cc'ing it.
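The quorum arithmetic behind Alex's answer below can be sketched as follows (a
minimal illustration of majority quorums, not actual ZooKeeper code):

```python
def quorum_size(ensemble_size):
    """A leader needs the support of a strict majority of the ensemble."""
    return ensemble_size // 2 + 1

# 7-node ensemble A..G: A dies, and {B,C,D} is partitioned from {E,F,G}.
n = 7
print(quorum_size(n))      # 4 -> neither 3-server group can elect a leader

# With a 5-node ensemble, replies from 3 servers suffice to commit.
print(quorum_size(5))      # 3

# With a 9-node ensemble, a leader needs support from 5 servers.
print(quorum_size(9))      # 5
```

This is why shrinking the configured ensemble from 7 to 5 lowers the required
quorum from 4 to 3, and why the partitioned groups of 3 in the scenario below
cannot make progress.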

Thanks,
Peter

On Tue, Aug 30, 2011 at 12:21 PM, Alexander Shraer <shralex@yahoo-inc.com> wrote:

> Hi Peter,
>
> It's the second option. The servers don't know whether the leader failed
> or was partitioned from them, so each group of 3 servers in your scenario
> can't distinguish that situation from another scenario in which none of
> the servers failed but these 3 servers are partitioned from the other 4.
> To prevent split brain in an asynchronous network, a leader must have the
> support of a quorum.
>
> Alex
>
> > -----Original Message-----
> > From: cheetah [mailto:xuwh06@gmail.com]
> > Sent: Tuesday, August 30, 2011 12:23 AM
> > To: dev@zookeeper.apache.org
> > Subject: How zab avoid split-brain problem?
> >
> > Hi folks,
> >     I am reading the Zab paper, but I am a bit confused about how Zab
> > handles the split-brain problem.
> >     Suppose there are seven servers A, B, C, D, E, F and G, and A is
> > currently the leader. A dies, and at the same time B, C and D are
> > isolated from E, F and G.
> >      In this case, will Zab continue working like this: if B>C>D and
> > E>F>G, the two groups each hold an election and choose B and E as their
> > respective leaders? That would be a split-brain problem.
> >      Or does ZooKeeper just stop working? Since there were originally 7
> > servers, after 1 failure a new leader still needs the votes of a quorum
> > of 4 servers (a majority of 7), and because the two groups are separated
> > from each other, no leader can be elected.
> >
> >       If it is the first case, ZooKeeper will have a split-brain
> > problem, which probably is not what happens. But in the second case, a
> > 7-node ZooKeeper service stops working after just one node failure
> > combined with a network partition.
> >
> >      Am I misunderstanding something? Looking forward to your insights.
> >
> > Thanks,
> > Peter
>
