zookeeper-user mailing list archives

From Alexander Shraer <shra...@gmail.com>
Subject Re: Incrementally bootstrapping a 3.5.0-alpha cluster?
Date Thu, 25 Jun 2015 21:36:25 GMT
This message by itself doesn't indicate a failure; it's quite normal. But if
you have a situation where the ensemble gets stuck or doesn't elect a
leader, please open a jira and post your server logs.

Thanks,
Alex

On Thu, Jun 25, 2015 at 9:43 AM, Benjamin Anderson <b@banjiewen.net> wrote:

> Hi Alexander, I've had much better luck with the codebase @91ecdac,
> but I've still observed the "Have smaller server identifier" type
> failure at least once. It's reliable enough for me to work around the
> remaining failures, at least.
>
> Thanks!
> --
> b
>
> On Wed, Jun 24, 2015 at 8:20 AM, Alexander Shraer <shralex@gmail.com>
> wrote:
> > Hi Benjamin, I'm curious whether this worked.
> >
> > thanks,
> > Alex
> >
> > On Sat, Jun 20, 2015 at 7:40 PM, Alexander Shraer <shralex@gmail.com> wrote:
> >
> >> There were bug fixes since the 2014 release. So if it doesn't work,
> >> perhaps you could try with trunk:
> >>
> >> svn checkout http://svn.apache.org/repos/asf/zookeeper/trunk <local dir>
> >>
> >> On Sat, Jun 20, 2015 at 7:35 PM, Alexander Shraer <shralex@gmail.com>
> >> wrote:
> >>
> >>> Hi,
> >>>
> >>> Approach A isn't supposed to work, since each server forms its own
> >>> ensemble. Each server is the leader of its own ensemble, so when you
> >>> try to reconfigure, it expects the other server to connect as a
> >>> follower, but that doesn't happen. The error just means that you can't
> >>> reconfigure since you would lose quorum (in an ensemble of 2 servers
> >>> both must ack every request, and here you won't have that since they
> >>> are not talking to each other).
> >>>
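The quorum argument above can be checked with a little arithmetic. The following is a minimal sketch, assuming plain majority quorums (not hierarchical or weighted ones); the function names `quorum_size` and `can_commit` are hypothetical and not part of ZooKeeper.

```python
def quorum_size(n_voters: int) -> int:
    """Smallest strict majority of n_voters voting members (assumes majority quorums)."""
    if n_voters < 1:
        raise ValueError("an ensemble needs at least one voting member")
    return n_voters // 2 + 1


def can_commit(n_voters: int, n_reachable: int) -> bool:
    """True if the reachable voters (including the leader) form a quorum."""
    return n_reachable >= quorum_size(n_voters)


# Reconfiguring from {1} to {1, 2} while server 2 never connects:
# the new two-server config needs both servers to ack, but only one is reachable.
print(quorum_size(2), can_commit(2, 1))
```

Under these assumptions a two-server configuration tolerates no unreachable member, which is consistent with the NewConfigNoQuorum error reported for Approach A.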
> >>> Approach B is supposed to work, no matter whether the first server is
> >>> 2 or 1. There may be a bug, of course, but I just tried the scenario
> >>> that fails for you (as I understood it) locally, and it worked. Here
> >>> is my setup; perhaps you can send me yours if it still doesn't work.
> >>>
> >>> server 1:
> >>> dataDir=/home/shralex/zk-sat/zookeeper1
> >>> standaloneEnabled=false
> >>> syncLimit=2
> >>> initLimit=5
> >>> tickTime=2000
> >>> server.1=localhost:2721:2731:participant;localhost:2791
> >>> server.2=localhost:2722:2732:participant;localhost:2792
> >>>
> >>> server 2:
> >>> dataDir=/home/shralex/zk-sat/zookeeper2
> >>> standaloneEnabled=false
> >>> syncLimit=2
> >>> initLimit=5
> >>> tickTime=2000
> >>> server.2=localhost:2722:2732:participant;localhost:2792
> >>>
> >>> Starting server 2 first; it says it's the leader. Then starting
> >>> server 1, then connecting to server 2 with a client and issuing a
> >>> reconfig adding server 1.
> >>>
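To make the server lines in this setup easier to read, here is a rough sketch that splits one into its fields and builds the text of the corresponding zkCli command. It assumes the 3.5 format `server.<id>=<host>:<port1>:<port2>[:role];[<client host>:]<client port>` and the `reconfig -add` form from the dynamic reconfiguration documentation; `ServerSpec`, `parse_server_line`, and `reconfig_add_command` are hypothetical helper names, and IPv6 addresses are not handled.

```python
from typing import NamedTuple, Optional


class ServerSpec(NamedTuple):
    sid: int
    host: str
    quorum_port: int      # first port: followers connect to the leader here
    election_port: int    # second port: leader election
    role: str             # "participant" or "observer"
    client_host: Optional[str]
    client_port: int


def parse_server_line(line: str) -> ServerSpec:
    """Parse one dynamic-config server line (assumed 3.5 format, no IPv6)."""
    key, _, value = line.strip().partition("=")
    prefix, _, sid = key.partition(".")
    if prefix != "server" or not sid.isdigit():
        raise ValueError(f"not a server line: {line!r}")
    peer, sep, client = value.partition(";")
    if not sep:
        raise ValueError("missing ';<client port>' part")
    parts = peer.split(":")
    if len(parts) == 3:
        parts.append("participant")  # role defaults to participant when omitted
    if len(parts) != 4:
        raise ValueError(f"unexpected peer address: {peer!r}")
    host, qport, eport, role = parts
    client_host, _, cport = client.rpartition(":")
    return ServerSpec(int(sid), host, int(qport), int(eport), role,
                      client_host or None, int(cport))


def reconfig_add_command(line: str) -> str:
    """Text to type at the zkCli prompt to add the server described by line."""
    parse_server_line(line)  # validate before building the command
    return f"reconfig -add {line.strip()}"


line = "server.1=localhost:2721:2731:participant;localhost:2791"
print(parse_server_line(line))
print(reconfig_add_command(line))
```

In the scenario described here, the generated command would be issued from a client connected to server 2, the node that started first.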
> >>> Alex
> >>>
> >>>
> >>>
> >>> On Fri, Jun 19, 2015 at 6:27 PM, Benjamin Anderson <b@banjiewen.net>
> >>> wrote:
> >>>
> >>>> Hi there - I'm working on automating bootstrapping of a 3-node ZK
> >>>> 3.5.0-alpha ensemble and I'm running into some problems with getting
> >>>> the nodes to join up. The dynamic configuration page[1] suggests that,
> >>>>
> >>>> "...it is possible to start a ZooKeeper ensemble containing a single
> >>>> participant and to dynamically grow it by adding more servers"
> >>>>
> >>>> which is what I'm attempting to do. I've found, however, that this can
> >>>> be rather problematic. What is the "correct" procedure for dynamically
> >>>> growing an ensemble from a single participant?
> >>>>
> >>>> I've tried two approaches:
> >>>>
> >>>> Approach A:
> >>>>
> >>>> 1. Start two nodes, one with myid=1 and one with myid=2. Each node's
> >>>> dynamicConfigFile contains a single line referring to itself, i.e.,
> >>>> neither node is aware of the other.
> >>>>
> >>>> 2. Open a zkCli to either of the two nodes and issue a `reconfig`
> >>>> command to add the other, unknown node.
> >>>>
> >>>> This method fails with "KeeperErrorCode = NewConfigNoQuorum for".
> >>>>
> >>>> Approach B:
> >>>>
> >>>> 1. Start one node with myid=1 and a dynamicConfigFile that only refers
> >>>> to itself, then start a second node with myid=2 and a
> >>>> dynamicConfigFile that refers to itself *and* the node with myid=1.
> >>>>
> >>>> 2. Open a zkCli to the node with myid=1 and issue a reconfig command
> >>>> to add the node with myid=2.
> >>>>
> >>>> This approach works! However, if the ordering is reversed (i.e., the
> >>>> myid=2 node boots first and refers only to itself, and the myid=1 node
> >>>> refers to both itself and the myid=2 node), then the myid=1 node will
> >>>> *never* come up cleanly - it hangs forever logging messages such as
> >>>> the one in this gist[2]. In my environment the boot ordering is not
> >>>> guaranteed, so this is rather challenging for me.
> >>>>
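The file layout in Approach B can be sketched as a small generator: the first node's dynamic config lists only itself, and each later node's file lists the nodes started before it plus itself. This is an illustrative sketch only; `dynamic_configs` is a hypothetical helper, the server lines are taken from the reply in this thread, and extending the pattern beyond two nodes is an assumption. Each joiner still has to be added with a reconfig command on the running ensemble.

```python
from typing import Dict, List


def dynamic_configs(join_order: List[int], server_lines: Dict[int, str]) -> Dict[int, str]:
    """Return {server id: dynamic config file text} for a staged bootstrap.

    join_order gives the server ids in the order they are started.
    """
    configs: Dict[int, str] = {}
    members: List[int] = []
    for sid in join_order:
        members.append(sid)
        # A joiner lists every node started before it, plus itself.
        lines = [server_lines[m] for m in sorted(members)]
        configs[sid] = "\n".join(lines) + "\n"
    return configs


server_lines = {
    1: "server.1=localhost:2721:2731:participant;localhost:2791",
    2: "server.2=localhost:2722:2732:participant;localhost:2792",
}

# The reversed ordering from the question: myid=2 boots first, myid=1 joins.
for sid, text in dynamic_configs([2, 1], server_lines).items():
    print(f"--- dynamic config for myid={sid} ---")
    print(text, end="")
```

Because the output depends only on the join order, the same helper covers both boot orderings discussed in this thread.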
> >>>> My baseline config is roughly this[3].
> >>>>
> >>>> Is there a well-known and reliable way to incrementally join nodes to
> >>>> a ZK ensemble in 3.5.0-alpha? Do I need to be using a newer version
> >>>> than the release cut back in August 2014?
> >>>>
> >>>> Thanks!
> >>>> --
> >>>> b
> >>>>
> >>>> [1]: http://zookeeper.apache.org/doc/trunk/zookeeperReconfig.html
> >>>> [2]: https://gist.github.com/banjiewen/936f5620d33a8eb0ddf4
> >>>> [3]: https://gist.github.com/banjiewen/c7f11c749933ac1bab72
> >>>>
> >>>
> >>>
> >>
>
