From: Alexander Shraer
Date: Sat, 20 Jun 2015 19:35:46 -0700
Subject: Re: Incrementally bootstrapping a 3.5.0-alpha cluster?
To: "user@zookeeper.apache.org"

Hi,

Approach 1 (your Approach A) isn't supposed to work, since each server forms its own ensemble. Each server is the leader of its own ensemble, so when you try to reconfigure, it expects the other server to connect as a follower, but that doesn't happen. The error just means that you can't reconfigure because you would lose a quorum (in an ensemble of 2 servers, both must ack every request, and here you won't have that since they are not talking to each other).

Approach 2 (your Approach B) is supposed to work, regardless of whether server 1 or server 2 starts first. There may be a bug of course, but I just tried the scenario that fails for you (as I understood it) locally, and it worked. Here is my setup; perhaps you can send me yours if it still doesn't work.

server 1:

    dataDir=/home/shralex/zk-sat/zookeeper1
    standaloneEnabled=false
    syncLimit=2
    initLimit=5
    tickTime=2000
    server.1=localhost:2721:2731:participant;localhost:2791
    server.2=localhost:2722:2732:participant;localhost:2792

server 2:

    dataDir=/home/shralex/zk-sat/zookeeper2
    standaloneEnabled=false
    syncLimit=2
    initLimit=5
    tickTime=2000
    server.2=localhost:2722:2732:participant;localhost:2792

Start server 2 first; it says it's the leader. Start server 1.
Then connect to server 2 with a client and issue a reconfig adding server 1.

Alex

On Fri, Jun 19, 2015 at 6:27 PM, Benjamin Anderson wrote:

> Hi there - I'm working on automating bootstrapping of a 3-node ZK
> 3.5.0-alpha ensemble and I'm running into some problems with getting
> the nodes to join up. The dynamic configuration page[1] suggests that,
>
> "...it is possible to start a ZooKeeper ensemble containing a single
> participant and to dynamically grow it by adding more servers"
>
> which is what I'm attempting to do. I've found, however, that this can
> be rather problematic. What is the "correct" procedure for dynamically
> growing an ensemble from a single participant?
>
> I've tried two approaches:
>
> Approach A:
>
> 1. Start two nodes, one with myid=1 and one with myid=2. Each node's
> dynamicConfigFile contains a single line referring to itself, i.e.,
> neither node is aware of the other.
>
> 2. Open a zkCli to either of the two nodes and issue a `reconfig`
> command to add the other, unknown node.
>
> This method fails with "KeeperErrorCode = NewConfigNoQuorum for".
>
> Approach B:
>
> 1. Start one node with myid=1 and a dynamicConfigFile that only refers
> to itself, then start a second node with myid=2 and a
> dynamicConfigFile that refers to itself *and* the node with myid=1.
>
> 2. Open a zkCli to the node with myid=1 and issue a reconfig command
> to add the node with myid=2.
>
> This approach works! However, if the ordering is reversed (i.e., the
> myid=2 node boots first and refers only to itself, and the myid=1 node
> refers to both itself and the myid=2 node), then the myid=1 node will
> *never* come up cleanly - it hangs forever logging messages such as
> the one in this gist[2]. In my environment the boot ordering is not
> guaranteed, so this is rather challenging for me.
>
> My baseline config is roughly this[3].
>
> Is there a well-known and reliable way to incrementally join nodes to
> a ZK ensemble in 3.5.0-alpha?
> Do I need to be using a newer version
> than the release cut back in August 2014?
>
> Thanks!
> --
> b
>
> [1]: http://zookeeper.apache.org/doc/trunk/zookeeperReconfig.html
> [2]: https://gist.github.com/banjiewen/936f5620d33a8eb0ddf4
> [3]: https://gist.github.com/banjiewen/c7f11c749933ac1bab72
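
The quorum argument in the reply (a 2-server ensemble needs both servers to ack) can be illustrated with a small sketch. This is a simplified model that assumes plain majority quorums; it is not ZooKeeper's actual reconfig check, and the function names `majority_quorum` and `new_config_has_quorum` are hypothetical, chosen only for illustration.

```python
def majority_quorum(ensemble_size: int) -> int:
    """Return the smallest number of servers that is a strict majority."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    return ensemble_size // 2 + 1


def new_config_has_quorum(new_config: set, connected: set) -> bool:
    """Check whether a majority of the proposed ensemble is reachable.

    Simplified model (an assumption, not ZooKeeper code): `connected`
    holds the ids of servers talking to the leader, including the
    leader's own id.
    """
    reachable = new_config & connected
    return len(reachable) >= majority_quorum(len(new_config))


if __name__ == "__main__":
    # First approach: each server only knows itself, so the leader
    # (server 1 here) has no connection from server 2.
    print(new_config_has_quorum({1, 2}, connected={1}))      # False

    # Second approach: server 1 lists server 2 in its config, so it
    # connects to the leader (server 2) before the reconfig is issued.
    print(new_config_has_quorum({1, 2}, connected={1, 2}))   # True
```

Under this model, the proposed 2-server configuration needs 2 reachable members, which is why a reconfig issued against an isolated single-server ensemble is rejected, while the same reconfig succeeds once the joining server has connected to the leader.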