zookeeper-user mailing list archives

From "Bae, Jae Hyeon" <metac...@gmail.com>
Subject Re: How to join quorum without restarting existing servers
Date Tue, 05 Nov 2013 18:18:38 GMT
I am attaching the log file. Could you take a look at why the new instance
cannot join the quorum?


On Tue, Nov 5, 2013 at 9:52 AM, Bae, Jae Hyeon <metacret@gmail.com> wrote:

> Thanks a lot Ben
>
> We are also using ZooKeeper in AWS with elastic IPs. I asked this
> question because, when a bad ZooKeeper EC2 instance is terminated and a new
> instance is launched with the previous elastic IP, it cannot join the quorum
> and logs no specific error messages. But when I did a rolling restart, the
> new instance started normally, synchronized, and joined the quorum.
>
> As I understand German's response, the new instance should start,
> synchronize, and join quorum successfully without any impact on existing
> instances but it didn't. I will investigate further.
>
> Thank you
> Best, Jae
>
>
> On Tue, Nov 5, 2013 at 8:24 AM, Ben Hall <ben@zynga.com> wrote:
>
>> Hi Jae,
>>
>> I wrote that article several years ago. (tbh - I hope it is not totally
>> out of date by now).  I agree with German's points.
>>
>> The issue it was solving was to replace a bad server without having to
>> shut down the ensemble and without having to update the config files on
>> each server. I would also add that this only works as long as the server
>> names and ports are the same - iirc at the time the article was written we
>> were using servers in AWS and referencing them either by assigned
>> hostnames such as zookeeper-[01|11] or by elastic IPs that could be moved
>> from server to server.
>>
>> If I understand your question correctly, if you are "adding a new server"
>> such as going from 7 to 9 servers, then this approach won't benefit you.
>>
>> We also used this approach when we would upgrade the servers, but like
>> German said we did it one server at a time so that the Leader election
>> could be natural.  This allowed us to upgrade a pool of 11 servers that
>> were responsible for many thousands of client connections without any
>> downtime.
>>
>> Thanks
>> Ben
>>
>>
>> On 11/5/13 6:51 AM, "German Blanco" <german.blanco.blanco@gmail.com>
>> wrote:
>>
>> >... and make sure that there is no rubbish in the data dir of the new
>> >server.
>> >
>> >
>> >On Tue, Nov 5, 2013 at 3:49 PM, German Blanco <
>> >german.blanco.blanco@gmail.com> wrote:
>> >
>> >> Hello Jae,
>> >>
>> >> I think that the answer to your question is "no, there is no benefit in
>> >> a rolling restart in that case".
>> >> If you remove a machine that was hosting a zookeeper server that was
>> >> part of a cluster, and replace it with a new machine, with a zookeeper
>> >> server running the same software version and listening on the same IP
>> >> and ports, then this new server will join the cluster, synchronize and
>> >> start working normally.
>> >> I wouldn't recommend replacing more than one server at a time, and I
>> >> think that it is better if the new server joins while the existing
>> >> quorum is stable (avoid leader elections while the new server joins,
>> >> i.e. avoid restarts or disconnections of the existing servers).
>> >>
>> >> Best regards,
>> >>
>> >> Germán.
>> >>
>> >>
>> >> On Tue, Nov 5, 2013 at 6:42 AM, Bae, Jae Hyeon <metacret@gmail.com> wrote:
>> >>
>> >>> Hi
>> >>>
>> >>> I read an article:
>> >>>
>> >>> http://www.benhallbenhall.com/2011/07/rolling-restart-in-apache-zookeeper-to-dynamically-add-servers-to-the-ensemble/
>> >>>
>> >>> My question is: even when failed hardware is replaced with the same IP
>> >>> address, do I need to do a rolling restart to add the replaced hardware
>> >>> to the quorum?
>> >>>
>> >>> I am using ZooKeeper version 3.4.5.
>> >>>
>> >>> Thank you
>> >>> Best, Jae
>> >>>
>> >>
>> >>
>>
>>
>
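
Editorial sketch, not part of the original thread: German's advice is to let the replacement server join while the existing quorum is stable. One way to check that might be to send ZooKeeper's `srvr` four-letter-word command to each member and read the `Mode:` line of the reply. This assumes the servers listen on the default client port 2181 and answer four-letter words (enabled by default in 3.4.5). The hostnames are hypothetical placeholders in the style Ben mentions, and the helper names are illustrative only.

```python
import socket
from typing import Iterable, Optional


def four_letter_word(host: str, port: int = 2181,
                     command: str = "srvr", timeout: float = 3.0) -> str:
    """Send a four-letter-word command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        # The server closes the connection once it has sent its reply.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(response: str) -> Optional[str]:
    """Return the value of the 'Mode:' line, or None if there is none."""
    for line in response.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def report(hosts: Iterable[str]) -> None:
    """Print one status line per ensemble member."""
    for host in hosts:
        try:
            mode = parse_mode(four_letter_word(host))
        except OSError as exc:  # covers refused connections and timeouts
            print(f"{host}: unreachable ({exc})")
            continue
        print(f"{host}: {mode or 'running but not serving requests'}")


if __name__ == "__main__":
    # Hypothetical hostnames; substitute the names or elastic IPs
    # listed in your own zoo.cfg.
    report(["zookeeper-01", "zookeeper-02", "zookeeper-03"])
```

A server that is running but has not joined the quorum replies with "This ZooKeeper instance is not currently serving requests", which contains no `Mode:` line, so it is reported as not serving. Running the check before and after starting the replacement shows whether the existing members kept their roles while the new one joined.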
