hadoop-zookeeper-user mailing list archives

From Henry Robinson <he...@cloudera.com>
Subject Re: zookeeper on ec2
Date Mon, 06 Jul 2009 19:40:48 GMT
On Mon, Jul 6, 2009 at 7:38 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

>
> I think that the misunderstanding is that this on-disk image is critical to
> cluster function.  It is not critical because it is replicated to all
> cluster members.  This means that any member can disappear and a new
> instance can replace it with no big cost other than the temporary load of
> copying the current snapshot from some cluster member.
>

This is an interesting way of doing things. It seems like there is a
correctness issue, though: if a majority of servers fail while the remaining
minority is lagging behind the leader for some reason, won't the ensemble's
current state be lost forever? This is akin to a majority of servers failing
and never recovering. ZK relies on the eventual liveness of a majority of
its servers; with EC2 it seems possible that this property might not be
satisfied.

(For majority, you can read 'quorum' under the flexible quorums scheme;
perhaps there is a way to devise a quorum scheme suitable for elastic
computing...)
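
To make the concern concrete, here is a minimal sketch in Python. It is a toy
model, not ZooKeeper code: the Server class, the helper functions, and the
zxid values (100, 97, 95) are all hypothetical, and "committed" is simplified
to mean "logged by a majority of the ensemble". It compares losing a minority
of servers with losing a majority while the survivors lag.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    last_zxid: int  # highest transaction id this server has logged (made-up values)
    alive: bool = True


def majority(n):
    """Smallest number of servers that forms a strict majority of n."""
    return n // 2 + 1


def committed_zxid(servers):
    """Highest zxid logged by at least a majority of the ensemble."""
    ranked = sorted((s.last_zxid for s in servers), reverse=True)
    return ranked[majority(len(servers)) - 1]


def best_surviving_zxid(servers):
    """Most recent state any surviving server could hand to a replacement."""
    alive = [s.last_zxid for s in servers if s.alive]
    return max(alive) if alive else None


def make_ensemble():
    # Five servers: three are up to date, two lag behind the leader.
    return [
        Server("a", 100),
        Server("b", 100),
        Server("c", 100),
        Server("d", 97),
        Server("e", 95),
    ]


def fail(servers, names):
    """Mark the named servers as permanently gone (instance terminated)."""
    for s in servers:
        if s.name in names:
            s.alive = False


ensemble = make_ensemble()
committed = committed_zxid(ensemble)

# Case 1: a minority (2 of 5) disappears. Some up-to-date server survives,
# so replacement instances can copy the full committed state.
fail(ensemble, {"a", "b"})
minority_case = best_surviving_zxid(ensemble)

# Case 2: the up-to-date majority (3 of 5) disappears. Only the laggards
# remain, so replacements can at best copy an older state.
ensemble = make_ensemble()
fail(ensemble, {"a", "b", "c"})
majority_case = best_surviving_zxid(ensemble)

print("committed:", committed)
print("after minority failure:", minority_case)
print("after majority failure:", majority_case)
```

In the second case the replacement instances restore a working quorum, but
the transactions between the survivors' state and the committed state are
gone, which is the loss I am worried about.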

Henry



>
> On Mon, Jul 6, 2009 at 11:33 AM, Mahadev Konar <mahadev@yahoo-inc.com>
> wrote:
>
> >  In the documentation of zookeeper, I have read that
> > > zookeeper saves snapshots of the in-memory data in the file system. Is
> > > that needed for recovery? Logically, it would be much easier for me if
> > > this is not the case.
> > Yes, zookeeper keeps persistent state on disk. This is used for recovery
> > and correctness of zookeeper.
>
