zookeeper-user mailing list archives

From Tecno Brain <cerebrotecnolog...@gmail.com>
Subject ZooKeeper's own leader election
Date Wed, 18 Apr 2018 22:32:40 GMT
Hi,
  I have a cluster of five ZooKeeper nodes.

  I have an application deployed on two other servers that runs a leader
election process using the Curator recipe
(https://curator.apache.org/curator-recipes/leader-election.html).
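
For context, here is roughly how the application uses the recipe (a
simplified sketch; the connect string and ZNode path below are placeholders,
not our real ones):

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.leader.LeaderSelector;
    import org.apache.curator.framework.recipes.leader.LeaderSelectorListenerAdapter;
    import org.apache.curator.retry.ExponentialBackoffRetry;

    public class AppLeaderElection {
        public static void main(String[] args) throws Exception {
            // Placeholder connect string for the five-node ensemble.
            CuratorFramework client = CuratorFrameworkFactory.newClient(
                    "zk1:2181,zk2:2181,zk3:2181,zk4:2181,zk5:2181",
                    new ExponentialBackoffRetry(1000, 3));
            client.start();

            LeaderSelector selector = new LeaderSelector(client, "/myapp/leader",
                    new LeaderSelectorListenerAdapter() {
                        @Override
                        public void takeLeadership(CuratorFramework c) throws Exception {
                            // The "expensive" startup task runs here; leadership
                            // is held until this method returns.
                            Thread.currentThread().join();
                        }
                    });
            selector.autoRequeue();  // rejoin the election after leadership is released
            selector.start();
        }
    }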

  My DevOps team has been running a Chaos Monkey type of test, and they
complained that my application triggered a change in leadership when they
*restarted two of the ZooKeeper nodes* (the leader node and one other node).

  I find this normal, but they claim that the application should let the
ZooKeeper nodes elect their own new leader, and that my application should
not change leadership, because the current leader did not fail; the failure
was in the ZooKeeper cluster.

  So, my questions are:
    - If the ZooKeeper leader node fails, are all sessions lost?
    - What parameters control how quickly the ZooKeeper nodes elect a new
leader?
    - Can I use longer timeouts in my application before giving up
leadership than the ones the ZooKeeper nodes use? (A sketch of our
client-side timeouts follows this list.)
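
For the third question, this is what we already control on the client side.
The values are examples, not our real settings; my understanding is that the
server negotiates the requested session timeout down to its
maxSessionTimeout (which defaults to 20 * tickTime), so please correct me if
I have that wrong:

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.retry.ExponentialBackoffRetry;

    public class ClientTimeouts {
        public static void main(String[] args) {
            CuratorFramework client = CuratorFrameworkFactory.builder()
                    .connectString("zk1:2181,zk2:2181,zk3:2181,zk4:2181,zk5:2181")
                    .sessionTimeoutMs(60_000)     // requested; the ensemble may negotiate it down
                    .connectionTimeoutMs(15_000)  // how long Curator waits to (re)connect
                    .retryPolicy(new ExponentialBackoffRetry(1_000, 10))
                    .build();
            client.start();
        }
    }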

My application currently runs an "expensive" task when taking leadership,
so we want to minimize leadership changes, especially when the change is not
caused by an application failure but by instability in the ZooKeeper
cluster.

I want to understand ZooKeeper's own leader election process so that I can
either modify the Curator recipe or have a solid argument for why what I am
being asked to do is not possible.
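For example, would something along these lines be a sane way to modify the
recipe's behavior, i.e. keep leadership while the connection is merely
SUSPENDED and only give it up once the session is actually LOST? (Just a
sketch; I realize it risks two instances acting as leader during the
suspended window.)

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.recipes.leader.CancelLeadershipException;
    import org.apache.curator.framework.recipes.leader.LeaderSelectorListenerAdapter;
    import org.apache.curator.framework.state.ConnectionState;

    public abstract class LostOnlyListener extends LeaderSelectorListenerAdapter {
        @Override
        public void stateChanged(CuratorFramework client, ConnectionState newState) {
            // The stock adapter cancels leadership on SUSPENDED and LOST;
            // this version tolerates SUSPENDED and only cancels on LOST.
            if (newState == ConnectionState.LOST) {
                throw new CancelLeadershipException();
            }
        }
    }
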
Any pointers are welcome.

-J
