zookeeper-dev mailing list archives

From Jordan Zimmerman <jor...@jordanzimmerman.com>
Subject Re: Leader election
Date Thu, 06 Dec 2018 19:09:32 GMT
> it seems like the
> inconsistency may be caused by the partition of the Zookeeper cluster
> itself

Yes - there are many ways in which you can end up with 2 leaders. However, if properly tuned
and configured, the overlap will last a few seconds at most. During a GC pause no work is being
done anyway. But this stuff is very tricky. Requiring an atomically unique leader is actually
a design smell, and you should reconsider your architecture.
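
For what it's worth, the usual way to keep that window small is to stop acting as leader the
moment the ZooKeeper session is in doubt. A rough sketch with Apache Curator's LeaderLatch -
the connection string, timeouts, znode path, and the work/stop hooks below are placeholders,
not anything from your setup:

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.leader.LeaderLatch;
import org.apache.curator.framework.state.ConnectionState;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class LeaderExample {
    public static void main(String[] args) throws Exception {
        // Placeholder connection string and timeouts - tune these for your cluster
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "host1:2181,host2:2181,host3:2181",
                15000,  // session timeout (ms)
                5000,   // connection timeout (ms)
                new ExponentialBackoffRetry(1000, 3));
        client.start();

        // Stop acting as leader as soon as the session is suspended or lost
        client.getConnectionStateListenable().addListener((c, newState) -> {
            if (newState == ConnectionState.SUSPENDED || newState == ConnectionState.LOST) {
                stopLeaderWork();
            }
        });

        LeaderLatch latch = new LeaderLatch(client, "/myapp/leader"); // placeholder path
        latch.start();
        latch.await();   // blocks until this instance is elected
        doLeaderWork();
    }

    private static void doLeaderWork()   { /* the "active instance" work goes here */ }
    private static void stopLeaderWork() { /* stop serving, release resources */ }
}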

> Maybe we could use something more
> lightweight, Hazelcast for example?

There is no distributed system that can guarantee a single leader. Instead you need to adjust
your design and algorithms to deal with this (using optimistic locking, etc.).
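
For example (a rough sketch only - the znode path and data are illustrative), the elected
instance can make every write conditional on the version of a shared znode; if another
instance has taken over in the meantime, the versioned write fails with BadVersionException
and the stale leader knows to stand down:

import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class OptimisticLeaderWrite {
    private static final String STATE_PATH = "/myapp/state"; // illustrative shared znode

    // Returns false if another instance modified the znode since we read it,
    // i.e. this instance is no longer the "real" leader and should stand down.
    static boolean tryLeaderWrite(ZooKeeper zk, byte[] newData)
            throws KeeperException, InterruptedException {
        Stat stat = new Stat();
        zk.getData(STATE_PATH, false, stat);                     // read current state + version
        try {
            zk.setData(STATE_PATH, newData, stat.getVersion());  // conditional (versioned) write
            return true;
        } catch (KeeperException.BadVersionException e) {
            return false;
        }
    }
}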

-Jordan

> On Dec 6, 2018, at 1:52 PM, Michael Borokhovich <michaelbor@gmail.com> wrote:
> 
> Thanks Jordan,
> 
> Yes, I will try Curator.
> Also, beyond the problem described in the Tech Note, it seems like the
> inconsistency may be caused by the partition of the Zookeeper cluster
> itself. E.g., if a "leader" client is connected to the partitioned ZK node,
> it may not be notified at the same time as the other clients connected to
> the other ZK nodes. So, another client may take leadership while the
> current leader is still unaware of the change. Is that correct?
> 
> Another follow-up question: if ZooKeeper can guarantee a single leader, is
> it worth using it just for leader election? Maybe we could use something more
> lightweight, Hazelcast for example?
> 
> Michael.
> 
> 
> On Thu, Dec 6, 2018 at 4:50 AM Jordan Zimmerman <jordan@jordanzimmerman.com>
> wrote:
> 
>> It is not possible to achieve the level of consistency you're after in an
>> eventually consistent system such as ZooKeeper. There will always be an
>> edge case where two ZooKeeper clients believe they are leaders (though only
>> for a short period of time). In terms of how it affects Apache Curator, we
>> have this Tech Note on the subject:
>> https://cwiki.apache.org/confluence/display/CURATOR/TN10 (the
>> description is true for any ZooKeeper client, not just Curator clients). If
>> you do still intend to use a ZooKeeper lock/leader, I suggest you try Apache
>> Curator, as writing these "recipes" is not trivial and they have many gotchas
>> that aren't obvious.
>> 
>> -Jordan
>> 
>> http://curator.apache.org
>> 
>> 
>>> On Dec 5, 2018, at 6:20 PM, Michael Borokhovich <michaelbor@gmail.com> wrote:
>>> 
>>> Hello,
>>> 
>>> We have a service that runs on 3 hosts for high availability. However, at
>>> any given time, exactly one instance must be active. So, we are thinking of
>>> using leader election via ZooKeeper.
>>> To this end, on each service host we also start a ZK server, so we have a
>>> 3-node ZK cluster and each service instance is a client to its dedicated
>>> ZK server.
>>> Then, we implement leader election on top of ZooKeeper using a basic
>>> recipe:
>>> https://zookeeper.apache.org/doc/r3.1.2/recipes.html#sc_leaderElection.
>>> 
>>> I have the following questions/doubts regarding the approach:
>>> 
>>> 1. It seems like we can run into inconsistency issues when a network
>>> partition occurs. The ZooKeeper documentation says that the inconsistency
>>> period may last “tens of seconds”. Am I understanding correctly that during
>>> this time we may have 0 or 2 leaders?
>>> 2. Is it possible to reduce this inconsistency time (let's say to 3
>>> seconds) by tweaking tickTime and syncLimit parameters?
>>> 3. Is there a way to guarantee exactly one leader all the time? Should we
>>> implement a more complex leader election algorithm than the one suggested
>>> in the recipe (using ephemeral_sequential nodes)?
>>> 
>>> Thanks,
>>> Michael.
>> 
>> 

