zookeeper-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Missing session state handling in most Leader Election implementations
Date Tue, 15 Nov 2011 02:24:31 GMT
On Mon, Nov 14, 2011 at 2:41 PM, Jordan Zimmerman <jzimmerman@netflix.com> wrote:

> It turns out that this is tricky to solve. When the server you're
> connected to goes down, you get a Watcher.Event.KeeperState.Disconnected.
> However, it could be that you are able to reconnect to another server so
> the disconnected event should be ignored.

The event should not be ignored.  The master should pause in being a
master, but not unload any major data structures.  If it reconnects
instantly, then it should continue as if nothing had happened.  You can
also have a time limit for how long you wait before you decide to pause
operation as master.  As you increase that time, you increase the
probability of two masters existing at the same time.  If the reconnect
happens before the timeout, you don't need to bother the master.
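
A minimal sketch of the pausing idea, under stated assumptions: the class
PausableMaster, its ConnState and Mode enums, and the graceMillis parameter
are all hypothetical names made up for illustration. ConnState only mirrors
the Disconnected / SyncConnected / Expired values of
Watcher.Event.KeeperState so that the example runs without the ZooKeeper
jar. Time is passed in explicitly to keep the logic easy to test.

```java
// Sketch only: ConnState is a local, hypothetical stand-in for
// ZooKeeper's Watcher.Event.KeeperState; nothing here is ZooKeeper or Curator API.
class PausableMaster {
    enum ConnState { SYNC_CONNECTED, DISCONNECTED, EXPIRED }
    enum Mode { ACTIVE, PAUSED, RELINQUISHED }

    private final long graceMillis;    // how long a disconnect is tolerated before pausing
    private long disconnectedAt = -1;  // -1 means "currently connected"
    private Mode mode = Mode.ACTIVE;   // assumes this process has already been elected

    PausableMaster(long graceMillis) {
        this.graceMillis = graceMillis;
    }

    // Feed connection state changes here, e.g. from a Watcher callback.
    synchronized void onEvent(ConnState state, long nowMillis) {
        if (mode == Mode.RELINQUISHED) {
            return; // session is gone; a new election is needed
        }
        switch (state) {
            case DISCONNECTED:
                if (disconnectedAt < 0) {
                    disconnectedAt = nowMillis; // start the grace period
                }
                tick(nowMillis); // a zero grace period pauses at once
                break;
            case SYNC_CONNECTED:
                // Reconnected with the session intact: carry on (or resume)
                // without having unloaded any data structures.
                disconnectedAt = -1;
                mode = Mode.ACTIVE;
                break;
            case EXPIRED:
                // Session expired: mastership is lost for good.
                mode = Mode.RELINQUISHED;
                break;
        }
    }

    // Call periodically; pauses the master once the grace period has run out.
    synchronized void tick(long nowMillis) {
        boolean disconnected = disconnectedAt >= 0;
        if (mode == Mode.ACTIVE && disconnected
                && nowMillis - disconnectedAt >= graceMillis) {
            mode = Mode.PAUSED;
        }
    }

    synchronized Mode mode() {
        return mode;
    }

    // Master-only work should check this before each operation.
    synchronized boolean mayActAsMaster() {
        return mode == Mode.ACTIVE;
    }
}
```

A larger graceMillis means fewer needless pauses but a longer window in
which two masters could be active at once, which is the trade-off
described above.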

> My solution is to watch for
> Watcher.Event.KeeperState.Disconnected and then execute a sync() (using
> the currently configured retry policy). If that sync fails, Curator will
> call the unhandledError() method of the LeaderSelectorListener. This seems
> like the best way to handle this. Thoughts?

I don't like this as well as pausing.

> As an aside, as part of working on this I now have a TestingCluster class
> that will create, in memory, n ZooKeeper servers in an ensemble. This
> could be useful to everyone :)


Can you inject events into this cluster?  That is what I have missed.
