zookeeper-user mailing list archives

From ralph tice <ralph.t...@gmail.com>
Subject Re: Changing leader to follower?
Date Sat, 11 Oct 2014 18:09:59 GMT
I'm not an expert, but I don't think there is a magic bullet here: leader
election has to happen in this circumstance, and that takes time.

You may be better served by building resilience into your services layer so
that ZooKeeper's uptime is no longer a single point of failure. Pinterest and
Airbnb both have some prior art here,
http://engineering.pinterest.com/post/77933733851/zookeeper-resilience-at-pinterest
and http://nerds.airbnb.com/smartstack-service-discovery-cloud/
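
As a rough illustration of that kind of resilience (a minimal sketch, not
taken from either linked post), a client can keep serving the last value it
read successfully while ZooKeeper is briefly unavailable, for example during
a leader election. The names CachedLookup, fetch and max_stale_seconds are
hypothetical; fetch stands in for whatever call actually reads from ZooKeeper.

```python
import time
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class CachedLookup(Generic[T]):
    """Serve the last good value when the backing store is unreachable."""

    def __init__(
        self,
        fetch: Callable[[], T],           # hypothetical: reads from ZooKeeper
        max_stale_seconds: float = 300.0,  # how old a cached value may be
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._max_stale = max_stale_seconds
        self._clock = clock
        self._value: Optional[T] = None
        self._fetched_at: Optional[float] = None

    def get(self) -> T:
        try:
            value = self._fetch()
        except Exception:
            # Backing store unavailable: fall back to the cached copy,
            # but only if it is still within the staleness budget.
            if (
                self._fetched_at is not None
                and self._clock() - self._fetched_at <= self._max_stale
            ):
                return self._value  # type: ignore[return-value]
            raise
        # Successful read: refresh the cache.
        self._value = value
        self._fetched_at = self._clock()
        return value
```

With a staleness budget well above the few seconds an election takes, reads
continue to succeed during the interruption; writes still have to wait for
the new leader.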

I'm curious why you chose a cross-DC ensemble versus localized same-region
ensembles.  Don't you already see frequent leader elections from being
spread across 3 regions anyway?


On Sat, Oct 11, 2014 at 11:21 AM, Jeff Potter <
jpotter-zookeeper@codepuppy.com> wrote:

>
> The reason I ask is that we’ve noticed, when running zookeeper cross-DC,
> that restarting the node that’s currently the leader causes a brief but
> real service interruption for 3 to 5 seconds while the rest of the cluster
> elects a new leader and syncs. We’re on AWS, with 2 ZK nodes in US-East, 2
> in US-West-2, and 1 in US-West (as a tie-breaker).
>
> It would seem useful to be able to take a leader to follower status, and
> to do so without it actually being a stop / disconnect for all clients
> connected to the node. (Especially for doing rolling restarts of all nodes,
> e.g. XEN-108 bug.)
>
> -Jeff
>
>
>
> On Oct 10, 2014, at 10:16 AM, Ivan Kelly <ivank@apache.org> wrote:
>
> > Or just pause the process until someone else takes over.
> >
> > 1. kill -STOP <zookeeper_pid>
> > 2. // wait for election to happen
> > 3. kill -CONT <zookeeper_pid>
> >
> > This won't stop it from becoming leader again. Also, clients may migrate
> > to other servers.
> >
> > -Ivan
> >
> > Alexander Shraer writes:
> >
> >> Hi,
> >>
> >> I don't think there's a direct way, although this seems a useful thing
> >> to add.
> >>
> >> One thing you could do is to issue a reconfig changing the leader's
> >> leading/quorum port (through which it talks with the followers). This
> >> will cause it to give up leadership while keeping it in the cluster.
> >>
> >> Cheers,
> >> Alex
> >>
> >> On Fri, Oct 10, 2014 at 5:57 AM, Jeff Potter <
> >> jpotter-zookeeper@codepuppy.com> wrote:
> >>
> >>>
> >>> Hi,
> >>>
> >>> Is there a way to “retire” a leader while keeping it in the cluster?
> >>>
> >>> Thanks,
> >>> Jeff
>
>
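
The pause-and-resume steps quoted above (kill -STOP, wait for the election,
kill -CONT) could be scripted roughly as follows. This is only a sketch: it
assumes the script runs on the leader's host, that the other servers answer
the "srvr" four-letter command on their client port, and that the command's
output contains a "Mode:" line. The function names and the wait_seconds and
poll_interval parameters are hypothetical.

```python
import os
import signal
import socket
import time
from typing import Callable, Iterable, Optional, Tuple

Server = Tuple[str, int]  # (host, client port)


def parse_mode(srvr_output: str) -> Optional[str]:
    """Return the value of the 'Mode:' line, e.g. 'leader' or 'follower'."""
    for line in srvr_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def query_srvr(server: Server, timeout: float = 2.0) -> str:
    """Send the 'srvr' four-letter command and return the raw reply."""
    with socket.create_connection(server, timeout=timeout) as sock:
        sock.sendall(b"srvr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def find_leader(
    servers: Iterable[Server],
    query: Callable[[Server], str] = query_srvr,
) -> Optional[Server]:
    """Return the first server reporting Mode: leader, if any."""
    for server in servers:
        try:
            if parse_mode(query(server)) == "leader":
                return server
        except OSError:
            continue  # unreachable or mid-election; try the next one
    return None


def pause_until_new_leader(
    pid: int,
    others: Iterable[Server],
    wait_seconds: float = 30.0,
    poll_interval: float = 0.5,
    query: Callable[[Server], str] = query_srvr,
    send_signal: Callable[[int, int], None] = os.kill,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[Server]:
    """SIGSTOP the process, wait for another leader, then SIGCONT it."""
    others = list(others)
    send_signal(pid, signal.SIGSTOP)
    try:
        deadline = time.monotonic() + wait_seconds
        while time.monotonic() < deadline:
            leader = find_leader(others, query)
            if leader is not None:
                return leader
            sleep(poll_interval)
        return None
    finally:
        # Always resume the paused process, even on timeout or error.
        send_signal(pid, signal.SIGCONT)
```

As noted in the quoted message, nothing here prevents the resumed node from
winning a later election, and the election itself still takes place, so the
interruption is moved rather than removed.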
