zookeeper-user mailing list archives

From cheetah <xuw...@gmail.com>
Subject Re: How zab avoid split-brain problem?
Date Tue, 30 Aug 2011 23:20:17 GMT
I see. This makes sense to me now. Thanks.

Looking forward to this feature.

Regards,
Peter

On Tue, Aug 30, 2011 at 4:04 PM, Alexander Shraer <shralex@yahoo-inc.com> wrote:

> Hi Peter,
>
> We're currently working on adding dynamic reconfiguration functionality to
> Zookeeper. I hope it will get into the next release of ZK (after 3.4). With
> this you'll just run a new zk command to add/remove servers, change ports,
> change roles (followers/observers), etc.
>
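> As a rough illustration (command name and syntax here are assumed, since
> the feature is still under development), adding or removing a server could
> look something like this from the zk client shell:
>
>     # hypothetical dynamic reconfiguration commands
>     reconfig -add server.5=hostE:2888:3888:participant;2181
>     reconfig -remove 3
>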
> Currently, membership is determined by the config file, so the only way of
> doing this is a "rolling restart": you change the configuration files and
> restart the servers one at a time. You should do it in a way that guarantees
> that at any time any quorum of the servers that are up intersects with any
> quorum of the old configuration (otherwise you might lose data). For
> example, if you're going from (A, B, C) to (A, B, C, D, E), it is possible
> that A and B have the latest data whereas C is falling behind (ZK commits
> data once a quorum has acknowledged it). If you just change the config files
> of A, B, C to say that they are part of the larger configuration, C might be
> elected leader with the support of D and E, and you could lose the data that
> only A and B have. So in this case you'll have to add D first and E later;
> this way the quorums intersect. Same thing when removing servers.
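>
> As a concrete sketch (hostnames and ports below are made up), the server
> lists in zoo.cfg would go through an intermediate step:
>
>     # step 1: add D only; update zoo.cfg on A, B, C, D and restart them
>     # one at a time
>     server.1=hostA:2888:3888
>     server.2=hostB:2888:3888
>     server.3=hostC:2888:3888
>     server.4=hostD:2888:3888
>
>     # step 2: once D is up and in sync, add E the same way
>     server.5=hostE:2888:3888
>
> Any quorum of the intermediate ensemble (3 of 4) overlaps any quorum of the
> old one (2 of 3), and any quorum of the final ensemble (3 of 5) overlaps any
> quorum of the intermediate one (3 of 4), so the latest data is always
> carried forward.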
>
> Alex
>
> > -----Original Message-----
> > From: cheetah [mailto:xuwh06@gmail.com]
> > Sent: Tuesday, August 30, 2011 3:36 PM
> > To: dev@zookeeper.apache.org
> > Cc: user@zookeeper.apache.org
> > Subject: Re: How zab avoid split-brain problem?
> >
> > Hi Alex,
> >
> > Thanks for the explanation.
> >
> > Then I have another question:
> >
> > If there are 7 machines in my current zookeeper cluster and two of them
> > have failed, how can I reconfigure Zookeeper to make it work with 5
> > machines? I.e., if the master can get 3 machines' replies, it can commit
> > the transaction.
> >
> > On the other hand, if I add 2 machines to make a 9-node Zookeeper cluster,
> > how can I configure it to take advantage of all 9 machines?
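> >
> > (For reference, my understanding is that the majority quorum is
> > floor(N/2) + 1 of the configured servers:
> >
> >     N = 7  ->  quorum = 4
> >     N = 5  ->  quorum = 3
> >     N = 9  ->  quorum = 5
> > )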
> >
> > This is more related to the user mailing list, so I'm cc'ing it.
> >
> > Thanks,
> > Peter
> >
> > On Tue, Aug 30, 2011 at 12:21 PM, Alexander Shraer <shralex@yahoo-inc.com> wrote:
> >
> > > Hi Peter,
> > >
> > > It's the second option. The servers don't know whether the leader failed
> > > or was partitioned from them, so each group of 3 servers in your scenario
> > > can't distinguish that situation from another scenario where none of the
> > > servers failed but these 3 servers are partitioned from the other 4. To
> > > prevent split brain in an asynchronous network, a leader must have the
> > > support of a quorum.
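> > >
> > > To make the arithmetic concrete for your scenario:
> > >
> > >     ensemble size N = 7          ->  majority quorum = floor(7/2) + 1 = 4
> > >     partition {B, C, D}: 3 < 4   ->  cannot elect a leader
> > >     partition {E, F, G}: 3 < 4   ->  cannot elect a leader
> > >
> > > Neither side can assemble a quorum, so neither side can elect a leader;
> > > the service stops making progress until the partition heals, and that is
> > > exactly what rules out split brain.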
> > >
> > > Alex
> > >
> > > > -----Original Message-----
> > > > From: cheetah [mailto:xuwh06@gmail.com]
> > > > Sent: Tuesday, August 30, 2011 12:23 AM
> > > > To: dev@zookeeper.apache.org
> > > > Subject: How zab avoid split-brain problem?
> > > >
> > > > Hi folks,
> > > >     I am reading the Zab paper, but I am a bit confused about how Zab
> > > > handles the split-brain problem.
> > > >     Suppose there are seven servers, A, B, C, D, E, F and G, and A is
> > > > currently the leader. A dies, and at the same time B, C, D are isolated
> > > > from E, F and G.
> > > >     In this case, will Zab continue working like this: if B>C>D and
> > > > E>F>G, the two groups both hold elections and elect B and E as their
> > > > leaders separately? That would be a split-brain problem.
> > > >     Or does Zookeeper just stop working, because there were originally
> > > > 7 servers, so after 1 failure a new leader still needs a quorum (4 of
> > > > the 7, counting itself) to support it, and because the two groups are
> > > > separated from each other, no leader can be elected?
> > > >
> > > >     If it is the first case, Zookeeper would have a split-brain
> > > > problem, which probably is not what happens. But in the second case, a
> > > > 7-node Zookeeper service stops making progress after just one node
> > > > failure combined with such a network partition.
> > > >
> > > >     Am I misunderstanding something? Looking forward to your insights.
> > > >
> > > > Thanks,
> > > > Peter
> > >
>
