zookeeper-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Recommended ZooKeeper backup procedures/scripts
Date Thu, 20 Oct 2011 21:45:25 GMT
On Thu, Oct 20, 2011 at 11:35 AM, Mike Schilli <m@perlmeister.com> wrote:

> On Wed, 19 Oct 2011, Patrick Hunt wrote:
>> Rsync is probably sufficient.
> Interesting ... could one use rsync for BCP purposes?


But I would worry just a bit about doing so.  You run a slight risk of
copying only part of a snapshot.  If you keep several snapshots, you know
you will get at least one complete one, but there is still a question when
your copy isn't a point-in-time copy.  The problem arises if the copy happens
in a bad order, so that you get all of something, including references to
something else that you only got part of.

If you can do a true snapshot somewhere, you will be safer from this
possibly only theoretical worry.
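
As a rough sketch of the "keep several" idea: ZooKeeper names snapshot files
snapshot.<zxid in hex>, so after a copy you can order them and treat the
newest one as suspect, since it is the one most likely to have been mid-write
during the rsync. The function names below are made up for illustration, and
the assumption that older snapshots were no longer changing is mine, not
something ZooKeeper guarantees for your copy.

```python
def snapshot_zxid(name: str) -> int:
    """Parse the hex zxid suffix of a file name such as 'snapshot.1a2b'."""
    return int(name.rsplit(".", 1)[1], 16)


def settled_snapshots(names: list[str]) -> list[str]:
    """Return snapshot file names, oldest first, excluding the newest one.

    Assumption (hypothetical policy): the newest snapshot may have been
    copied while it was still being written, so only older ones are trusted.
    """
    snaps = sorted(
        (n for n in names if n.startswith("snapshot.")),
        key=snapshot_zxid,
    )
    return snaps[:-1]
```

With only one snapshot in the copy this returns an empty list, which is the
case where a true point-in-time snapshot would be the safer route.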

> So if I have two farms of zookeepers, could I rsync data from one farm
> to the other, shut down the first and start up the other?

Almost certainly.

And if you could do a first rsync, then shut down the first, and then do
another touchup rsync before starting up the other, you definitely can do
it.   The second rsync will be very fast and is nice insurance.
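
A minimal sketch of that sequence, in case it helps. Everything named here is
hypothetical: the paths, the stop_old/start_new callables (however you stop
and start your servers), and the choice of plain "rsync -a". It only shows
the ordering: bulk copy, stop the old farm, touch-up copy, start the new one.

```python
import subprocess


def rsync_cmd(src: str, dest: str) -> list[str]:
    """Build an rsync command that copies the contents of src into dest.

    -a is rsync's archive mode (recursive, preserves permissions and times).
    The trailing slash on src means "copy the directory's contents".
    """
    return ["rsync", "-a", src.rstrip("/") + "/", dest]


def migrate(src, dest, stop_old, start_new, run=subprocess.run):
    """Two-pass copy: bulk rsync, stop the old farm, touch-up rsync, start new."""
    run(rsync_cmd(src, dest), check=True)  # first pass, old farm still serving
    stop_old()                             # no more writes after this point
    run(rsync_cmd(src, dest), check=True)  # touch-up pass, only the differences
    start_new()


# Example call (hypothetical paths and helpers):
# migrate("/var/zookeeper/data", "newhost:/var/zookeeper/data",
#         stop_old=stop_old_farm, start_new=start_new_farm)
```

Because the old farm is stopped before the second pass, the touch-up copy is
a point-in-time copy, which is what removes the partial-snapshot worry above.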

> I presume that you guys recommend using ZooKeeper's quorum model
> for BCP purposes, but how would you typically distribute ZooKeeper
> instances between different colos? Say, you have N colos, how many ZooKeeper
> instances
> would you need to form a quorum, even if 1, or 2, or N-1 colos fail?

Running ZooKeeper between data centers is not a problem, but is not usually
recommended without careful analysis.  The basic problem is that ZK is very
conservative and will require more than half of your servers to be up to
continue functioning.  A second problem is that the propagation delay between
the data centers can wreak havoc.

> For example, if you have two colos, and 2 ZooKeeper instances in the
> first and 3 ZooKeeper instances in the other, you'd have a total of 5
> instances which is a valid quorum. But as soon as the colo with 3
> instances goes down, you're left with 2 instances which is an even
> number. What do you guys recommend?

One trick is to run a single instance on either side and put a tie-breaker
instance in EC2.
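
To make the arithmetic concrete, here is a small sketch of the majority rule
applied to both layouts. The site names and the survives() helper are
illustrative only. Note that in the 2+3 case the two survivors fail because
2 is less than a majority of the configured 5, not because 2 is even.

```python
def majority(ensemble_size: int) -> int:
    """Smallest number of servers that is more than half of the ensemble."""
    return ensemble_size // 2 + 1


def survives(layout: dict[str, int], failed: set[str]) -> bool:
    """True if the servers outside the failed sites still form a majority."""
    total = sum(layout.values())
    alive = sum(n for site, n in layout.items() if site not in failed)
    return alive >= majority(total)


two_colos = {"colo_a": 2, "colo_b": 3}              # layout from the question
tie_breaker = {"colo_a": 1, "colo_b": 1, "ec2": 1}  # one per side plus EC2

for name, layout in (("2+3", two_colos), ("1+1+1", tie_breaker)):
    for site in layout:
        state = "quorum" if survives(layout, {site}) else "no quorum"
        print(f"{name}: lose {site} -> {state}")
```

The tie-breaker layout keeps a quorum when any one site is lost, but not when
two are; no placement across sites survives losing more than half the servers.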
