zookeeper-user mailing list archives

From Karol Dudzinski <karoldudzin...@gmail.com>
Subject Re: Leader election duration
Date Tue, 28 Apr 2015 19:22:41 GMT
Well, these are prod clusters, so my ability to experiment is rather limited.  I can take a copy
of the snapshot and try both a 3-node and a 5-node setup in a test cluster.

One thing I forgot to mention is that in most clusters the number of election notification
log lines I see is, give or take, the same as the number of participants.  In this
cluster, however, there are typically 2 or 3 times as many notifications as participants.

My gut feeling is that it's more likely to be due to load, as the 5-node cluster is much busier
and the election time has been increasing over time (as has load).  I have no idea which aspect
of load matters, though: number of clients, frequency of transactions, total data size,
etc.  I don't understand why load would matter, but that may just be my limited knowledge of the election
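
For anyone who wants to compute the same min/max/mean/std dev figures quoted below from their own logs, here is a minimal Python sketch. It assumes the duration appears as an integer number of milliseconds directly after "LEADER ELECTION TOOK - ", and that the leader's equivalent line starts with "LEADING"; both are assumptions about the log format, so adjust the pattern to match your ZooKeeper version. The function names are made up for illustration.

```python
import re
import statistics
import sys

# Assumed log format: "... FOLLOWING - LEADER ELECTION TOOK - 39"
# The "LEADING" prefix for the leader's line is also an assumption.
ELECTION_RE = re.compile(r"(LEADING|FOLLOWING) - LEADER ELECTION TOOK - (\d+)")


def election_durations_ms(lines):
    """Return the election durations (ms) found in an iterable of log lines."""
    durations = []
    for line in lines:
        match = ELECTION_RE.search(line)
        if match:
            durations.append(int(match.group(2)))
    return durations


def summarize(durations):
    """Return count, min, max, mean and sample std dev, or None if empty."""
    if not durations:
        return None
    return {
        "count": len(durations),
        "min_ms": min(durations),
        "max_ms": max(durations),
        "mean_ms": statistics.mean(durations),
        # Sample std dev needs at least two data points.
        "stdev_ms": statistics.stdev(durations) if len(durations) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Usage: python election_stats.py zookeeper.log [more.log ...]
    all_durations = []
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as handle:
            all_durations.extend(election_durations_ms(handle))
    print(summarize(all_durations))
```

Running it over the logs of every participant from one rolling restart gives one number per election per node, which also makes it easy to see whether the slow elections cluster on particular nodes.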


> On 28 Apr 2015, at 19:54, Camille Fournier <camille@apache.org> wrote:
> Just out of curiosity, if you start the 5 node cluster up with only 3 of
> the nodes to begin with (like, config 5, but only bring up 3 processes),
> does it speed up the leader election or is it still slow?
> C
> On Tue, Apr 28, 2015 at 1:41 PM, Karol Dudzinski <karoldudzinski@gmail.com>
> wrote:
>> Hi,
>> We're seeing some rather strange leader election in one of our clusters.
>> The duration reported by the "FOLLOWING - LEADER ELECTION TOOK" log line
>> (and equivalent for the leader) seems to vary hugely.  During one rolling
>> reboot, I saw the number reported as small as 39ms and as large as 57
>> seconds (difference in units is not a typo).  The average is just about 10
>> seconds and std dev also about 10 seconds.  So the time taken is not only
>> quite large, it's also very variable.
>> We have other clusters but the average election time in those is in the
>> hundreds of millis with std dev in a similar ballpark.  I guess one
>> difference is the "slow" cluster is 5 participants while the others are 3,
>> which may be a factor but I wouldn't expect it to make two orders of
>> magnitude difference!
>> So my question is, what factors contribute to the election time reported
>> by these log lines? And what can we do to speed this up?
>> As far as I understand from logs and a quick browse through the code that
>> time is the time to select a leader.  Syncing up to the leader happens
>> after that.  The syncing part I can understand will vary depending on load
>> but I don't see why selecting the leader would.
>> Thanks,
>> Karol
