Date: Tue, 30 Aug 2011 16:20:17 -0700
Subject: Re: How zab avoid split-brain problem?
From: cheetah
To: user@zookeeper.apache.org

I see. This makes sense to me now. Thanks. Looking forward to this feature.

Regards,
Peter

On Tue, Aug 30, 2011 at 4:04 PM, Alexander Shraer wrote:

> Hi Peter,
>
> We're currently working on adding dynamic reconfiguration functionality
> to ZooKeeper. I hope that it will get into the next release of ZK (after
> 3.4). With this you'll just run a new zk command to add/remove any
> servers, change ports, change roles (followers/observers), etc.
>
> Currently, membership is determined by the config file, so the only way
> of doing this is a "rolling restart". This means that you change the
> configuration files and restart the servers. You should do it in a way
> that guarantees that at any time, any quorum of the servers that are up
> intersects with any quorum of the old configuration (otherwise you might
> lose data). For example, if you're going from (A, B, C) to
> (A, B, C, D, E), it is possible that A and B have the latest data
> whereas C is falling behind (ZK stores data on a quorum), so if you just
> change the config files of A, B, and C to say that they are part of the
> larger configuration, then C might be elected with the support of D and
> E and you might lose data. So in this case you'll have to first add D,
> and later add E; this way the quorums intersect. The same applies when
> removing servers.
>
> Alex
>
> > -----Original Message-----
> > From: cheetah [mailto:xuwh06@gmail.com]
> > Sent: Tuesday, August 30, 2011 3:36 PM
> > To: dev@zookeeper.apache.org
> > Cc: user@zookeeper.apache.org
> > Subject: Re: How zab avoid split-brain problem?
> >
> > Hi Alex,
> >
> > Thanks for the explanation.
> >
> > Then I have another question:
> >
> > If there are 7 machines in my current ZooKeeper cluster and two of
> > them fail, how can I reconfigure ZooKeeper to make it work with 5
> > machines, i.e. so that if the master gets 3 machines' replies, it can
> > commit the transaction?
> >
> > On the other hand, if I add 2 machines to make a 9-node ZooKeeper
> > cluster, how can I configure it to take advantage of the 9 machines?
> >
> > This is more related to the user mailing list, so I cc it.
> >
> > Thanks,
> > Peter
> >
> > On Tue, Aug 30, 2011 at 12:21 PM, Alexander Shraer wrote:
> >
> > > Hi Peter,
> > >
> > > It's the second option. The servers don't know whether the leader
> > > failed or was partitioned from them. So each group of 3 servers in
> > > your scenario can't distinguish that situation from another scenario
> > > where none of the servers failed but those 3 servers are partitioned
> > > from the other 4. To prevent a split brain in an asynchronous
> > > network, a leader must have the support of a quorum.
> > >
> > > Alex
> > >
> > > > -----Original Message-----
> > > > From: cheetah [mailto:xuwh06@gmail.com]
> > > > Sent: Tuesday, August 30, 2011 12:23 AM
> > > > To: dev@zookeeper.apache.org
> > > > Subject: How zab avoid split-brain problem?
> > > >
> > > > Hi folks,
> > > > I am reading the Zab paper, but I am a bit confused about how Zab
> > > > handles the split-brain problem.
> > > > Suppose there are seven servers A, B, C, D, E, F, and G, and A
> > > > is the leader. A dies, and at the same time B, C, D are isolated
> > > > from E, F, and G.
> > > > In this case, will Zab continue working like this: if B>C>D
> > > > and E>F>G, the two groups both vote and elect B and E as their
> > > > leaders separately? Then there would be a split-brain problem.
> > > > Or does ZooKeeper just stop working, because there were
> > > > originally 7 servers, so even after 1 failure a new leader still
> > > > expects a quorum of 4 servers voting for it as the leader? And
> > > > because the two groups are separated from each other, no leader
> > > > can be elected.
> > > >
> > > > If it is the first case, ZooKeeper will have a split-brain
> > > > problem, which probably is not the case. But in the second case,
> > > > a 7-node ZooKeeper service can only handle one node failure plus
> > > > one network partition failure.
> > > >
> > > > Am I understanding this wrongly? Looking forward to your
> > > > insights.
> > > >
> > > > Thanks,
> > > > Peter
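The quorum arithmetic behind both answers in the thread above can be sketched in a few lines of Python. This is an illustrative model, not ZooKeeper code: `majority`, `quorums`, and `all_intersect` are hypothetical helper names, and real leader election also weighs zxids and server ids, which this ignores.

```python
from itertools import combinations

def majority(n):
    """Smallest quorum size: strictly more than half of an n-server ensemble."""
    return n // 2 + 1

def quorums(servers):
    """All minimal majority quorums of an ensemble."""
    q = majority(len(servers))
    return [set(c) for c in combinations(sorted(servers), q)]

def all_intersect(old_servers, new_servers):
    """True iff every quorum of the new config overlaps every quorum of the
    old config -- Alex's safety condition for a rolling restart."""
    return all(a & b for a in quorums(old_servers) for b in quorums(new_servers))

# Peter's scenario: 7 servers, leader A dies, {B,C,D} is partitioned
# from {E,F,G}. A quorum of 7 is 4, so neither group of 3 can elect a
# leader -- ZooKeeper stalls rather than splitting into two ensembles.
print(majority(7))                          # 4
print(len({"B", "C", "D"}) >= majority(7))  # False
print(len({"E", "F", "G"}) >= majority(7))  # False

# Alex's rolling-restart example: jumping straight from (A,B,C) to
# (A,B,C,D,E) is unsafe, because new quorum {C,D,E} misses old quorum
# {A,B}, which may be the only pair holding the latest data.
print(all_intersect({"A", "B", "C"}, {"A", "B", "C", "D", "E"}))       # False
# Adding one server at a time keeps every pair of quorums overlapping:
print(all_intersect({"A", "B", "C"}, {"A", "B", "C", "D"}))            # True
print(all_intersect({"A", "B", "C", "D"}, {"A", "B", "C", "D", "E"}))  # True
```

The same check explains Peter's 7-to-5 question: shrinking the configured ensemble from 7 to 5 lowers the quorum from 4 to 3, which is exactly why membership changes must themselves respect quorum intersection (or, once available, go through the dynamic `reconfig` command).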