I am attaching the log file. Could you take a look at why the new instance cannot join the quorum?

On Tue, Nov 5, 2013 at 9:52 AM, Bae, Jae Hyeon <metacret@gmail.com> wrote:
Thanks a lot Ben

We are also using ZooKeeper in AWS with elastic IPs. The reason I asked this question is that when the bad ZooKeeper EC2 instance is terminated and a new instance is launched with the previous elastic IP, the new instance cannot join the quorum, and it logs no specific error messages. But when I did a rolling restart, the new instance started normally, synchronized, and joined the quorum.

As I understand German's response, the new instance should start, synchronize, and join the quorum successfully without any impact on the existing instances, but it didn't. I will investigate further.
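
One way to see whether the replacement instance has actually joined is to ask each server for its role with the `srvr` four-letter command on the client port. The following is a minimal sketch, assuming the default client port 2181 and that the four-letter commands are reachable from wherever the script runs. The function names are illustrative and not part of ZooKeeper.

```python
import socket


def parse_mode(srvr_output):
    """Return the value of the 'Mode:' line (e.g. leader, follower), or None."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def query_srvr(host, port=2181, timeout=5.0):
    """Send the 'srvr' four-letter command and return the raw reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def report(hosts):
    """Print the role of every server, or why it could not be determined."""
    for host in hosts:
        try:
            mode = parse_mode(query_srvr(host))
        except OSError as exc:
            mode = "unreachable ({})".format(exc)
        print(host, "->", mode or "no Mode line in reply")
```

Calling `report([...])` with the ensemble's hostnames or elastic IPs should show one leader and the rest as followers once the new instance is in. A reply without a `Mode:` line suggests the server is running but has not joined a quorum yet.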

Thank you
Best, Jae

On Tue, Nov 5, 2013 at 8:24 AM, Ben Hall <ben@zynga.com> wrote:
Hi Jae,

I wrote that article several years ago. (tbh - I hope it is not totally
out of date by now).  I agree with German's points.

The problem it solved was replacing a bad server without having to
shut down the ensemble and without having to update the config files on
each server. I would also add that this only works as long as the server
names and ports are the same - iirc at the time the article was written we
were using servers in AWS and referencing them either by assigned
hostnames such as zookeeper-[01|11] or by elastic IPs that could be moved
from server to server.
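
The "same names and ports" condition can be checked against the `server.N=host:port:port` entries that the other servers already have in their zoo.cfg. The following is a rough sketch, not anything from the article. The helper names, the hostnames, and the 2888/3888 ports in the sample are assumptions for illustration.

```python
def parse_servers(cfg_text):
    """Map server id -> (host, quorum_port, election_port) from zoo.cfg text."""
    servers = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        # Only 'server.N=host:port:port' entries matter here.
        if not line.startswith("server.") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        server_id = int(key.strip()[len("server."):])
        host, quorum_port, election_port = value.strip().split(":")[:3]
        servers[server_id] = (host, int(quorum_port), int(election_port))
    return servers


def is_drop_in_replacement(cfg_text, my_id, host, quorum_port, election_port):
    """True if the new machine matches the entry the ensemble already expects."""
    expected = parse_servers(cfg_text).get(my_id)
    return expected == (host, quorum_port, election_port)


# Hypothetical config as seen by the surviving servers.
SAMPLE_CFG = """
clientPort=2181
server.1=zookeeper-01:2888:3888
server.2=zookeeper-02:2888:3888
server.3=zookeeper-03:2888:3888
"""
```

Here `my_id` stands for the number in the replacement's `myid` file. If the replacement comes up with a different id, hostname, or port than the entry the others hold, their config files would have to change, which is exactly what this approach is meant to avoid.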

If I understand your question correctly, if you are "adding a new server"
such as going from 7 to 9 servers, then this approach won't benefit you as

We also used this approach when we would upgrade the servers, but like
German said we did it one server at a time so that the Leader election
could be natural.  This allowed us to upgrade a pool of 11 servers that
were responsible for many thousands of client connections without any down


On 11/5/13 6:51 AM, "German Blanco" <german.blanco.blanco@gmail.com> wrote:

>... and make sure that there is no rubbish in the data dir of the new
>On Tue, Nov 5, 2013 at 3:49 PM, German Blanco <
>german.blanco.blanco@gmail.com> wrote:
>> Hello Jae,
>> I think that the answer to your question is "no, there is no benefit in
>> rolling restart in that case".
>> If you remove a machine that was hosting a zookeeper server that was part
>> of a cluster, and replace it with a new machine, with a zookeeper server
>> running the same software version and listening on the same IP and port,
>> then this new server will join the cluster, synchronize and start
>> normally.
>> I wouldn't recommend replacing more than one server at a time, and I
>> think that it is better if the new server joins while the existing cluster
>> is stable (avoid leader elections while the new server joins, i.e. avoid
>> restarts or disconnections of the existing servers).
>> Best regards,
>> Germán.
>> On Tue, Nov 5, 2013 at 6:42 AM, Bae, Jae Hyeon <metacret@gmail.com> wrote:
>>> Hi
>>> I read an article
>>> My question is, even though failed hardware is replaced with the same
>>> address, do I need to do a rolling restart to add the replaced hardware to
>>> the quorum?
>>> I am using ZooKeeper version 3.4.5.
>>> Thank you
>>> Best, Jae