zookeeper-user mailing list archives

From Tecno Brain <cerebrotecnolog...@gmail.com>
Subject Re: Zookeeper own leader election
Date Thu, 19 Apr 2018 01:51:38 GMT
Hi Jordan,

Correct, I know that the internal leader election has nothing to do with
the leader election of my application through Curator.

What we are observing is that restarting (or killing) one or two servers
in a five-node ZooKeeper ensemble triggers a leader election in my
application.
Our expectation is that this should not occur, since the ZooKeeper ensemble
still has quorum.
Is that the correct expectation?
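
To make the quorum part of that expectation concrete, here is a minimal
sketch of the arithmetic, assuming the default majority-quorum
configuration. The function names are made up for illustration and are not
part of ZooKeeper or Curator.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of the ensemble (assumes majority quorums)."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size: int, servers_down: int) -> bool:
    """True if the servers still running form a majority of the ensemble."""
    return ensemble_size - servers_down >= quorum_size(ensemble_size)


if __name__ == "__main__":
    # Five-node ensemble, as in our deployment.
    for down in range(4):
        print(f"{down} server(s) down -> quorum kept: {has_quorum(5, down)}")
```

This only checks whether the ensemble as a whole can keep serving requests.
It says nothing about what happens to the connection of a client that was
attached to one of the restarted servers.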

On Wed, Apr 18, 2018 at 6:08 PM, Jordan Zimmerman <
jordan@jordanzimmerman.com> wrote:

> The term "leader election" has two meanings here. The kind of leader
> election that your application uses with Apache Curator is different from
> the internal leader election that ZooKeeper does amongst its nodes. For
> clarity, the internal leader election should probably be renamed to "master
> election" or something. In a ZooKeeper ensemble one instance is always
> chosen as the leader/master. All writes, etc. go through this master. If
> this master instance goes down (due to crash, restart, chaos monkey, etc.)
> then the ensemble must choose a new leader/master. This is simply how
> ZooKeeper works.
>
> >    - If the Zookeeper leader node fails, are all sessions lost?
>
> No. Sessions are transactions in the ZK database like any other. When a
> new ZK leader/master is elected the sessions will continue. In fact, the
> session timeout clock is reset, because the new leader/master treats the
> moment of its election as time "0".
>
> >    - What parameters control how quickly the zookeeper nodes elect a new
> > leader?
>
> I believe "initLimit" is the most important one here (others can correct
> me).
>
> >    - Can I have longer timeouts in my application before giving up
> > leadership than that of the zookeeper nodes?
>
> I don't totally understand this question. The internal leader/master
> election has nothing whatever to do with Apache Curator leaders.
>
> -Jordan
>
> > On Apr 19, 2018, at 7:32 AM, Tecno Brain <cerebrotecnologico@gmail.com>
> > wrote:
> >
> > Hi,
> >  I have a cluster of five Zookeeper nodes.
> >
> >  I have an application deployed on two other servers that runs a leader
> > election process using the Curator recipe (
> > https://curator.apache.org/curator-recipes/leader-election.html)
> >
> >  My DevOps team has been running a ChaosMonkey-style test, and they
> > complained that my application triggered a change in leadership when
> > they *restarted two of the Zookeeper* nodes (the leader node and an
> > extra node).
> >
> >  I find it normal, but they claim that the application should let the
> > Zookeeper nodes elect their own new leader and that my application should
> > not change leadership, because the current leader did not fail; the
> > failure was in the Zookeeper cluster.
> >
> >  So, my question is:
> >    - If the Zookeeper leader node fails, are all sessions lost?
> >    - What parameters control how quickly the zookeeper nodes elect a new
> > leader?
> >    - Can I have longer timeouts in my application before giving up
> > leadership than that of the zookeeper nodes?
> >
> > My application currently runs an "expensive" task when taking leadership,
> > so we want to minimize changes of leadership, especially when the cause
> > is not an application failure but an unstable Zookeeper cluster.
> >
> > I want to understand Zookeeper's own leader election process so that I
> > can either modify the Curator recipe or have a solid argument to explain
> > that what I am asked to do is not possible.
> > Any pointers are welcome.
> >
> > -J
>
>
