zookeeper-user mailing list archives

From Ivan Kelly <iv...@apache.org>
Subject Re: Ensure there is one master
Date Thu, 28 Nov 2013 18:04:51 GMT
There is always a gap between checking that you are master and
performing an action as master, so you can't guarantee that another
node hasn't taken mastership before you perform the action. However, if
you are using some shared storage for the state of the system, you can
block other masters from writing to it while you are master, which
prevents split brain from occurring.
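
To make the idea concrete, here is a minimal sketch in Python of a store that refuses writes from a superseded master. Everything in it is an assumption for illustration: SharedStore, StaleMasterError and the epoch number are hypothetical names, and the sketch assumes each newly elected master is handed an epoch larger than any previous one.

```python
class StaleMasterError(Exception):
    """Raised when a write carries an epoch older than one already seen."""


class SharedStore:
    """Toy shared storage (hypothetical) that remembers the newest master epoch."""

    def __init__(self):
        self.highest_epoch = 0
        self.state = {}

    def write(self, epoch, key, value):
        # The check and the update must be atomic inside the store itself.
        # That is what closes the gap between "am I master?" and "act as master".
        if epoch < self.highest_epoch:
            raise StaleMasterError(
                "epoch %d is older than %d" % (epoch, self.highest_epoch)
            )
        self.highest_epoch = epoch
        self.state[key] = value
```

The old master may still believe it is master, but its write is rejected by the storage rather than by its own (possibly stale) view of the election.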

The simplest way to do this is to store all your state in
ZooKeeper. With this, if B is partitioned away, it will not be able to
update the state. This won't scale very far, as ZooKeeper holds
everything in memory, but it may work if the system isn't too big.

Another solution is to use BookKeeper
(http://zookeeper.apache.org/bookkeeper), or another shared storage
with fencing, to sequence the updates to your state. BookKeeper is a
distributed write-ahead log, which writes entries to a quorum before
responding to the client. It has a fencing mechanism which sends a
'fence' message to at least one node in each quorum, blocking all
further writes to that log. In your case, if a node in B is master and
is putting all state updates into a BookKeeper log before applying
them, and then there is a partition and a node in A becomes master, A
will fence B's log before applying any state updates of its own.
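
The fencing step can be modelled roughly as below. This is a toy simulation in Python, not the BookKeeper client API: Bookie, LogWriter, fence_log and FencedError are hypothetical names, and the quorum rule is simplified to "a write needs ack_quorum acknowledgements".

```python
class FencedError(Exception):
    """Raised when a write cannot gather enough acknowledgements."""


class Bookie:
    """Toy storage node holding a copy of one log."""

    def __init__(self):
        self.fenced = False
        self.entries = []

    def add_entry(self, entry):
        # A fenced node refuses every further write to this log.
        if self.fenced:
            return False
        self.entries.append(entry)
        return True


class LogWriter:
    """Toy writer: an append succeeds only with ack_quorum acknowledgements."""

    def __init__(self, bookies, ack_quorum):
        self.bookies = bookies
        self.ack_quorum = ack_quorum

    def append(self, entry):
        acks = sum(1 for b in self.bookies if b.add_entry(entry))
        if acks < self.ack_quorum:
            # An unfenced node may still have stored the entry, but the
            # write was never acknowledged, so the writer must not apply it.
            raise FencedError("only %d acks for %r" % (acks, entry))
        return acks


def fence_log(reachable, total, ack_quorum):
    """Fence enough nodes that no ack quorum of unfenced nodes remains."""
    needed = total - ack_quorum + 1
    if len(reachable) < needed:
        raise RuntimeError("cannot reach enough nodes to fence the log")
    for bookie in reachable[:needed]:
        bookie.fenced = True
```

With three nodes and an ack quorum of two, fencing any two nodes leaves at most one unfenced node, so the old master in B can never again collect two acknowledgements, whatever it believes about its own mastership.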

Yet another solution, though I don't know how well it would work, is
to use locks in NFS. If B is logging to a file on a SAN, it gets
an exclusive lock on the file handle. This will block anyone else from
logging to it until B's NFS session goes away. I'm not sure how long
it takes for sessions to time out, though, or how widely implemented or
reliable this part of the NFS spec is.
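
For what the locking pattern looks like, here is a small Python sketch using flock on a local file on a Unix system. The helper name try_exclusive_lock is made up for illustration, and the sketch says nothing about how the lock behaves over NFS, which depends on the client and server and is exactly the open question above.

```python
import fcntl


def try_exclusive_lock(path):
    """Open path for appending and try to take an exclusive advisory lock.

    Returns the open file on success, or None if another open file
    already holds the lock. The lock is released when the file is closed.
    """
    f = open(path, "a")
    try:
        # LOCK_NB makes the call fail immediately instead of blocking.
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()
        return None
    return f
```

A master would keep the returned file open for as long as it is logging. Note that the lock is advisory: it only stops writers that also try to take it.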

Hope this helps,


On Tue, Nov 26, 2013 at 12:54:44AM -0800, ms209495 wrote:
> Hi,
> ZooKeeper is an excellent system but the problem with ensuring only one
> master among clients bothers me.
> Let's have a look at the situation when a network partition happens: there is
> part A (majority), and part B (minority).
> Let's assume that before the network partition happened the master was connected
> to part B.
> After the network partition, part A will elect a new ZooKeeper leader, and
> there will be a new master elected among the clients connected to part A.
> At this time there are two masters - the old one in part B, and the new one in part A.
> The only solution I can think of for this problem is to ensure that the
> new master is inactive for some time - to ensure that the old master in this
> time will detect that it is not connected to the ZooKeeper quorum, and will
> deactivate itself as a master.
> This solution assumes that timers on these machines work correctly.
> Is it possible to ensure only one master using ZooKeeper without timing
> assumptions?
> Thanks,
> Maciej
> --
> View this message in context: http://zookeeper-user.578899.n2.nabble.com/Ensure-there-is-one-master-tp7579367.html
> Sent from the zookeeper-user mailing list archive at Nabble.com.
