zookeeper-user mailing list archives

From Jordan Zimmerman <jor...@jordanzimmerman.com>
Subject Re: FYI - Apache ZooKeeper Backup, a Treatise
Date Thu, 16 Jun 2016 21:41:09 GMT
Contrary to recommendations everywhere, my experience is that almost everyone is storing
source-of-truth data in ZooKeeper. It’s just too tempting. You have a distributed file system just
sitting there and it’s too easy to use. You get a lot of great features like watches, etc.
People are using it to store configuration data, sequence numbers, etc. They are storing these
things without a good means of reproducing them in case of a catastrophic outage. Further,
I’ve heard of several orgs who just back up the transaction logs and think they can restore
them for DR. Anyway, that’s the genesis of my blog post.


> On Jun 16, 2016, at 2:39 PM, Chris Nauroth <cnauroth@hortonworks.com> wrote:
> Yes, thank you to Jordan for the article!
> Like Flavio, I personally have never come across the requirement for
> ZooKeeper backups.  I've generally followed the pattern that data stored
> in ZooKeeper is truly transient, and applications are built either to
> tolerate loss of that data or reconstruct it from first principles if it
> goes missing.  Adding observers in a second data center would give a
> rudimentary approximation of off-site backup in the case of a data center
> disaster, with the usual caveats around propagation delays.
> Jordan, I'd be curious if you can share more specific details about the
> kind of data that you have that necessitates a backup/restore.  (If you're
> not at liberty to share this, then I can understand that.)  It might
> inform if we have a motivating use case for backup/restore features within
> ZooKeeper, such as some of the transaction log filtering that the article
> mentions.
> --Chris Nauroth
> On 6/16/16, 1:03 AM, "Flavio Junqueira" <fpj@apache.org> wrote:
>> Great write-up, Jordan, thanks!
>> Whether to back up zk data or not is possibly an open topic for this
>> community, even though we have discussed it at times. My sense has been
>> that precisely because of the issues you mention in your post, it is
>> typically best to have a way to recreate its data upon a disaster rather
>> than back up the data. I think there could be three general scenarios in
>> which folks would prefer to back up data, but correct me if these
>> aren't accurate:
>> - The data in zk isn't elsewhere, so it can't be recreated: zk isn't a
>> regular database, so I'd think it is best not to store primary data in
>> it and to focus instead on cluster data or metadata.
>> - There is just a lot of data and I'd rather have a shorter time to
>> recover: zk in general shouldn't have that much data in db, but let's go
>> with the assumption that for the requirements of the application it is a
>> lot. For such a case, it probably depends on whether your application can
>> efficiently and effectively recover from a backup. Basically, as pointed
>> out in the post, the data could be inconsistent and cause trouble if you
>> don't think about the corner cases.
>> - The code to recreate the zk metadata for my application is super
>> complex: if you decide to code against zk, it is good to think about
>> whether reconstructing the state after a disaster is doable and, if it
>> is, to design and implement that reconstruction.
>> Also, we typically provision enough replicas, often replicating across
>> data centers, to make sure that the data isn't all gone. Having more
>> replicas does not rule out completely the possibility of a disaster, but
>> in such rare cases we resort to the expensive path.
>> I personally have never worked with an application that was taking
>> backups of zk data in prod, so I'm really interested in what others
>> think. 
>> -Flavio
>>> On 16 Jun 2016, at 00:43, Jordan Zimmerman <jordan@jordanzimmerman.com>
>>> wrote:
>>> FYI - I wrote a blog about backing up ZooKeeper:
>>> https://www.elastic.co/blog/zookeeper-backup-a-treatise
>>> -Jordan
