zookeeper-user mailing list archives

From Flavio Junqueira <fpjunque...@yahoo.com>
Subject Re: leader election doesn't settle
Date Sat, 31 May 2014 09:32:50 GMT
Please consider committing ZK-1810 as well! :-)

-Flavio

On 31 May 2014, at 04:29, Michi Mutsuzaki <michi@cs.stanford.edu> wrote:

> Thanks Flavio. I guess it's time for me to upgrade to 3.4.6 :)
> 
> Regarding the socket timeout, I see that the timeout is set to
> self.tickTime * self.initLimit in connectToLeader(). I'm using the
> default values for both tickTime and initLimit, so it should have
> timed out sooner. I'll double check these settings and the time the
> leader got killed.
> 
> Thanks!
> --Michi
> 
> 
> On Fri, May 30, 2014 at 1:30 AM, FPJ <fpjunqueira@yahoo.com> wrote:
>> Hi Michi,
>> 
>> 1) The follower stops following the leader when it gets an exception on the
>> socket (Follower.followLeader):
>>         ...
>>         while (self.isRunning()) {
>>             readPacket(qp);
>>             processPacket(qp);
>>         }
>>     } catch (Exception e) {
>>     ...
>> 
>>      I believe we are setting the timeout like this: self.tickTime *
>> self.initLimit. Check connectToLeader().
>> 
>> 2) I believe we fixed this bug in 3.4.6 and the change is pending for trunk.
>> Check ZK-1808 for 3.4.6 and ZK-1810 for trunk.
>> 
>> -Flavio
>> 
>> 
>>> -----Original Message-----
>>> From: mutsuzaki@gmail.com [mailto:mutsuzaki@gmail.com] On Behalf Of
>>> Michi Mutsuzaki
>>> Sent: 30 May 2014 06:30
>>> To: user@zookeeper.apache.org
>>> Subject: leader election doesn't settle
>>> 
>>> I have a 3 server cluster using ZooKeeper 3.4.5 with server IDs 61, 150, and
>>> 228, and 150 is the leader. I shut down 150. I have 2 questions.
>>> 
>>> 1) Both 61 and 228 take about 5 minutes to detect that the leader died. Is
>>> there a TCP setting I need to tune to make this quicker?
>>> 
>>> https://paste.apache.org/4AFR?action=download
>>> 
>>> 2) Leader election between 61 and 228 never settles. 61 doesn't seem to
>>> receive notifications from 228, and 228 keeps receiving notifications from 61
>>> for the previous epoch. I restarted 61 and the leader election settled. Have
>>> you guys seen this behavior?
>>> 
>>> https://paste.apache.org/37vU?action=download
>> 
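
Editor's sketch of the timeout discussed in the thread. The thread says the follower's socket timeout is self.tickTime * self.initLimit and that the follower leaves its read loop when the socket throws. The Java below is a minimal, standalone illustration of that idea and is not ZooKeeper code. The class and method names are hypothetical, the "silent peer" is a stand-in for an unresponsive leader, and tickTime=2000 and initLimit=10 are example values assumed for illustration (the thread only says "default values" without stating them).

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class FollowerTimeoutSketch {

    // Read timeout as described in the thread: tickTime (ms) * initLimit (ticks).
    static int readTimeoutMs(int tickTimeMs, int initLimit) {
        return tickTimeMs * initLimit;
    }

    // Returns true if a blocking read from a peer that never sends data
    // gives up with SocketTimeoutException once timeoutMs has elapsed.
    static boolean readTimesOut(int timeoutMs) throws IOException {
        // Hypothetical stand-in for an unresponsive leader: the listening
        // socket completes the TCP handshake via its backlog but never writes.
        try (ServerSocket silentPeer =
                 new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket sock = new Socket(silentPeer.getInetAddress(),
                                      silentPeer.getLocalPort())) {
            sock.setSoTimeout(timeoutMs); // bound every blocking read
            InputStream in = sock.getInputStream();
            try {
                in.read();                // blocks: no data ever arrives
                return false;
            } catch (SocketTimeoutException e) {
                return true;              // a read loop would exit here
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed example settings, not values taken from the thread.
        int timeout = readTimeoutMs(2000, 10);
        System.out.println("read timeout with assumed settings: " + timeout + " ms");
        // Use a short timeout so the demonstration finishes quickly.
        System.out.println("silent peer detected: " + readTimesOut(200));
    }
}
```

With the assumed settings the computed timeout is 20000 ms (20 seconds), well below the roughly 5 minutes reported in the original question, which is consistent with the remark that the followers "should have timed out sooner".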

