Mailing-List: contact user-help@zookeeper.apache.org; run by ezmlm
Reply-To: user@zookeeper.apache.org
References: <1385456084052-7579367.post@n2.nabble.com> <1385504882746-7579376.post@n2.nabble.com>
Date: Wed, 27 Nov 2013 11:57:48 +1100
Subject: Re: Ensure there is one master
From: Cameron McKenzie
To: user@zookeeper.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=047d7b6d908af27cac04ec1e1545

--047d7b6d908af27cac04ec1e1545
Content-Type: text/plain; charset=ISO-8859-1

Excuse my ignorance (I'm relatively new to ZK), but how does the accuracy
of the clock affect this situation?

On Wed, Nov 27, 2013 at 11:53 AM, Ted Dunning wrote:

> This is not necessarily true. The old master may not have an accurate
> clock.
>
> The ascending id idea that Alex mentions is a very nice way to put more
> guarantees on the process.
>
>
>
> On Tue, Nov 26, 2013 at 2:58 PM, Alexander Shraer wrote:
>
> > Cameron's solution basically relies on additional timing assumptions
> > as Maciej mentions in his question.
> >
> > One more thing you could do is to implement increasing generation ids
> > for masters, and have clients in your system reject commands from a
> > master if they already know that a master with a higher generation id
> > was elected (either because they saw a command from the new master or
> > because they got a notification from ZK). This way each client can
> > only have a single master and goes forward in time.
> >
> > Alex
> >
> > On Tue, Nov 26, 2013 at 2:34 PM, Cameron McKenzie wrote:
> > > If I'm understanding your question correctly, you're worried that
> > > when the current 'master' loses its connection to ZooKeeper, a new
> > > 'master' will be elected and you will have 2 'master' nodes at the
> > > same time. As soon as you lose a connection to ZooKeeper there are
> > > no guarantees about any of the state that you're determining from
> > > it. When you lose the ZooKeeper connection, your 'master' must
> > > assume that it is no longer a 'master' node until it reconnects to
> > > ZooKeeper, at which point it will be able to work out what's going
> > > on.
> > >
> > > If you look at Apache Curator, its implementation of the Leader
> > > latch recipe handles this loss of connection and reestablishment.
> > >
> > > cheers
> > > Cam
> > >
> > >
> > > On Wed, Nov 27, 2013 at 9:28 AM, ms209495 wrote:
> > >
> > >> Thanks for the reply. I want to clarify one thing.
> > >> I am thinking of a System of 20 nodes that uses a ZooKeeper
> > >> ensemble of 3 nodes.
> > >> I am thinking of master election among these 20 nodes, which do
> > >> not run consensus themselves but use the ZooKeeper service for
> > >> master election.
> > >> I used the term 'leader' for a leader in ZooKeeper (among the 3
> > >> nodes), and the term 'master' for the master in the System (20
> > >> nodes).
> > >> The solution is described here:
> > >> http://zookeeper.apache.org/doc/trunk/recipes.html#sc_leaderElection
> > >> (I would name it 'master' election, not 'leader' election), but I
> > >> doubt that it works reliably without additional timing assumptions,
> > >> as I described in my previous post.
> > >> Please consider my previous post in the context of the System that
> > >> uses ZooKeeper (not ZooKeeper itself).
> > >>
> > >> --
> > >> View this message in context:
> > >> http://zookeeper-user.578899.n2.nabble.com/Ensure-there-is-one-master-tp7579367p7579376.html
> > >> Sent from the zookeeper-user mailing list archive at Nabble.com.

--047d7b6d908af27cac04ec1e1545--
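
For readers following the thread, here is a minimal sketch of the increasing-generation-id idea Alex describes above, seen from the client side. It is an illustration under assumptions, not anything posted in the thread: the names FencedClient, StaleMasterError, on_new_master and handle_command are hypothetical, and the sketch assumes each elected master obtains a strictly increasing integer generation id from somewhere (how that id is derived from ZooKeeper is left open here).

```python
class StaleMasterError(Exception):
    """Raised when a command carries an older generation id than one already seen."""


class FencedClient:
    """Hypothetical client that only moves forward in master generations."""

    def __init__(self):
        self.highest_generation = -1  # no master seen yet
        self.applied = []             # commands accepted so far

    def on_new_master(self, generation):
        # Hypothetical hook: called when a ZK notification announces a new master.
        # Never move backwards, even if notifications arrive out of order.
        self.highest_generation = max(self.highest_generation, generation)

    def handle_command(self, generation, command):
        # Reject commands from a master that is known to have been superseded.
        if generation < self.highest_generation:
            raise StaleMasterError(
                f"generation {generation} < known {self.highest_generation}"
            )
        # Seeing a command from a newer master also advances what we know.
        self.highest_generation = generation
        self.applied.append(command)


client = FencedClient()
client.handle_command(1, "write-a")      # accepted from master generation 1
client.on_new_master(2)                  # ZK reports that generation 2 was elected
try:
    client.handle_command(1, "write-b")  # old master still believes it is master
except StaleMasterError:
    pass                                 # the stale command is rejected
client.handle_command(2, "write-c")      # accepted from the new master
```

The check does not depend on the old master noticing in time that it lost its ZooKeeper session, which is the timing assumption raised in the thread; a client that has not yet heard of the new master by either route would still accept the old master's commands.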