zookeeper-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Backups
Date Thu, 19 Jan 2012 19:40:15 GMT
Well, "snap the datadir" is definitely one step better than "copy the
datadir", but your point is well taken that this isn't the strongest story.
 At least snaps take essentially zero time and are guaranteed to be
consistent copies of files.  Most snaps also are incremental which means
that they tend to copy less than everything.

Also, any time for a snapshot will be arbitrary in some sense.

The problem that I see with writing a copy from the server is that we can't
guarantee that all the values get written at the same time.  If we had an
immutable table structure (I suggested this once, Thomas K suggested it
also), then this wouldn't be a big deal since we would just make a snapshot
of the table to write.  With our current data structure, this isn't nice.
 It also isn't nice to increase the load on the server just to get a backup.
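
To make the immutable-table point concrete, here is a minimal sketch in Python. It is an illustration only, not ZooKeeper's actual data tree: Node, set_data and get_data are hypothetical names. Every write returns a new root and copies only the nodes along the changed path, so a "snapshot" is nothing more than holding on to an old root while writes continue.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import List, Mapping, Optional


@dataclass(frozen=True)
class Node:
    """Immutable znode-like tree node (hypothetical, not ZooKeeper's own structure)."""
    data: bytes = b""
    children: Mapping[str, "Node"] = field(
        default_factory=lambda: MappingProxyType({})
    )


def _set(node: Node, parts: List[str], data: bytes) -> Node:
    # Base case: we reached the target node, so replace its data only.
    if not parts:
        return Node(data, node.children)
    head, rest = parts[0], parts[1:]
    child = node.children.get(head, Node())
    # Copy this level's child map; untouched siblings are shared, not copied.
    new_children = dict(node.children)
    new_children[head] = _set(child, rest, data)
    return Node(node.data, MappingProxyType(new_children))


def set_data(root: Node, path: str, data: bytes) -> Node:
    """Return a new root with `path` set to `data`; the old root is unchanged."""
    return _set(root, [p for p in path.split("/") if p], data)


def get_data(root: Node, path: str) -> Optional[bytes]:
    """Return the data stored at `path`, or None if the path does not exist."""
    node = root
    for part in (p for p in path.split("/") if p):
        if part not in node.children:
            return None
        node = node.children[part]
    return node.data


empty = Node()
v1 = set_data(empty, "/a/b", b"1")
snapshot = v1                      # taking a snapshot is just keeping a reference
v2 = set_data(v1, "/a/b", b"2")    # later writes never touch the snapshot
v3 = set_data(v2, "/c", b"x")      # the /a subtree is shared with v2, not copied
```

A backup writer could serialize `snapshot` at its own pace and every value it sees belongs to the same version. The cost is extra allocation on each write, which is part of why this is a redesign rather than a patch.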

On Thu, Jan 19, 2012 at 6:24 PM, Patrick Hunt <phunt@apache.org> wrote:

> I don't think we have a "backup" story today. "copy the datadir" is
> not a great story. You could for example get snaps/txnlogs that are
> only partially written. Now this is fine from the perspective that a
> ZK server can recover from that, but EOD it's pretty ugly. Also
> requires you to copy the entire datadir, and not just the most recent
> "known good" snap/txnlog file(s). Not to mention the issue we talked
> about before - you're getting a copy from some unknown point in time,
> largely defined by how up to date the server is with the leader. It
> seems to me that if we really want to support backing up the servers
> we need a better story than this. Perhaps some tool which can ask a
> server to generate a "backup" (but only if it's reasonably up to date
> with the leader, most importantly that it's actually active in the
> ensemble, etc...), ensure that the file creation happened successfully
> (ie verify the output files), then copy that result, rather than the
> "copy the datadir" approach we have today.
> Patrick
> On Thu, Jan 19, 2012 at 10:16 AM, Jordan Zimmerman
> <jzimmerman@netflix.com> wrote:
> > Ted - are you referring to my original plan to backup the transaction logs
> > or the new idea of backing up certain nodes?
> >
> > -JZ
> >
> > On 1/19/12 10:11 AM, "Ted Dunning" <ted.dunning@gmail.com> wrote:
> >
> >>What Jordan is doing would allow backups without special storage devices
> >>and, with good backup of the log, would allow nearly current recovery in
> >>the event of catastrophic loss.  Yes, this loses some durability, but it
> >>is still very desirable.
> >
