zookeeper-user mailing list archives

From kishore g <g.kish...@gmail.com>
Subject Re: Backups
Date Fri, 20 Jan 2012 07:42:20 GMT
User error is a valid use case.  Are we assuming that, because of the user
error, ZK is not usable at this point? If not, can someone please explain how
having a backup can actually restore the data without bringing all ZK
servers down and without disrupting the clients?

If we really want to take care of user error, then what we need is probably
a way to go back to the state just before the transaction that messed up the
ZK state. Can we not achieve this by providing a tool that generates a snapshot
and transaction log such that, when the server is restarted, it starts exactly
from that transaction? We can do this by simply using the existing snapshot
files and transaction logs from any of the servers. Do we really need a
separate backup, since the data is available on multiple servers?

We need a way to generate a snapshot that will take us to the exact point in
time (identified by either a timestamp or a transaction number). One problem I
see is that ZK probably can't go back in transaction number.
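
As a rough illustration of the file-selection part of such a tool, here is a
minimal Python sketch. It relies on ZooKeeper naming its files
snapshot.<zxid in hex> and log.<zxid in hex>; the function name
files_for_rollback and the example file names are made up for illustration.
It only picks which existing files would be needed to reach a target
transaction. It does not truncate the last log at the target zxid, and it
ignores the fact that snapshots are fuzzy (a snapshot can include effects of
transactions later than the zxid in its name), so treat it as a starting point
rather than a working restore tool.

```python
import re
from typing import Iterable, List, Tuple

# ZooKeeper data files are named "snapshot.<hex zxid>" and "log.<hex zxid>".
_NAME = re.compile(r"^(snapshot|log)\.([0-9a-fA-F]+)$")


def files_for_rollback(names: Iterable[str], target_zxid: int) -> Tuple[str, List[str]]:
    """Pick the snapshot and transaction logs needed to reach target_zxid.

    Hypothetical helper: returns the newest snapshot whose zxid is not past
    the target, plus the log files that may hold the transactions between
    that snapshot and the target.
    """
    snaps: List[Tuple[int, str]] = []
    logs: List[Tuple[int, str]] = []
    for name in names:
        m = _NAME.match(name)
        if m is None:
            continue  # ignore unrelated files
        entry = (int(m.group(2), 16), name)
        (snaps if m.group(1) == "snapshot" else logs).append(entry)

    usable = [s for s in snaps if s[0] <= target_zxid]
    if not usable:
        raise ValueError("no snapshot at or before the target zxid")
    snap_zxid, snap_name = max(usable)

    # Replay has to begin with the first transaction after the snapshot.
    first_needed = snap_zxid + 1
    logs.sort()
    start = 0
    for i, (zxid, _) in enumerate(logs):
        if zxid <= first_needed:
            start = i  # last log that starts at or before first_needed

    # Logs that start after the target cannot contain anything we need.
    chosen = [name for zxid, name in logs[start:] if zxid <= target_zxid]
    return snap_name, chosen
```

For example, with files snapshot.64, snapshot.c8, log.1, log.65 and log.c9, a
target zxid of 0x96 would select snapshot.64 together with log.65.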

Thoughts?


On Thu, Jan 19, 2012 at 11:42 AM, Ted Dunning <ted.dunning@gmail.com> wrote:

> That is one important case.  The offsite backup condition is probably well
> handled by a listener.
>
> On Thu, Jan 19, 2012 at 7:30 PM, Flavio Junqueira <fpj@yahoo-inc.com>
> wrote:
>
> > You're not talking about data corruption, are you? It is incorrect data
> > that has been introduced by a user or application by mistake. Am I getting
> > it right?
> >
> > -Flavio
> >
> >
> > On Jan 19, 2012, at 8:07 PM, Jordan Zimmerman wrote:
> >
> >> It's that very replication that creates the need for backups. If there is
> >> a user error or a bad injection of data, the error will quickly replicate
> >> to all the instances. There's no way to recover without an external
> >> backup.
> >>
> >>
> >> -JZ
> >>
> >>
> >> On 1/19/12 10:39 AM, "Flavio Junqueira" <fpj@yahoo-inc.com> wrote:
> >>
> >>> Hi Ted, Znodes for leader election, group membership, etc., can all be
> >>> recreated, so why should I back them up instead of recreating the
> >>> znodes? In fact, one might bring back a previous snapshot of the
> >>> system that reflects an incorrect system state.
> >>>
> >>> In the case that one stores data that can't be recovered by other
> >>> means, I understand the need, but then we have the durability problem
> >>> that I mentioned and that you apparently agreed with. Also, ZooKeeper is a
> >>> replicated service, so why can't you simply rely upon the replication
> >>> strategy that ZooKeeper provides to you already? Again, I'm trying to
> >>> understand the use cases here.
> >>>
> >>> Thanks,
> >>> -Flavio
> >>>
> >>> On Jan 19, 2012, at 7:11 PM, Ted Dunning wrote:
> >>>
> >>>> A backup can still be useful.  It is a common property that a database
> >>>> backup is known to be slightly out of date.
> >>>>
> >>>> Such a backup can still be very useful.  In many systems, the most common
> >>>> cause of error is simple human intervention.  This especially applies to
> >>>> file systems and databases, but can still apply to ZK if an admin
> >>>> carelessly tries to clean up part of the namespace and accidentally cleans
> >>>> up all of it.  This should be much less common with ZK because manual
> >>>> adjustments are so much less a part of standard operation, but they can
> >>>> still occur.  In these cases, an out-of-date backup may be enormously
> >>>> valuable.
> >>>>
> >>>> If somebody wants a precise backup from a particular moment in time, the
> >>>> best option is to use the snapshot capabilities exposed by various file
> >>>> systems.  Traditional NAS vendors all support this.  At a lower cost and
> >>>> complexity point, you can get this from MapR clusters exposed as NFS or by
> >>>> a ZFS file system.  This option also allows you to keep multiple snapshots
> >>>> from points in the past.
> >>>>
> >>>> What Jordan is doing would allow backups without special storage devices
> >>>> and, with good backup of the log, would allow nearly current recovery in
> >>>> the event of catastrophic loss.  Yes, this loses some durability, but it is
> >>>> still very desirable.
> >>>>
> >>>> On Thu, Jan 19, 2012 at 11:07 AM, Flavio Junqueira <fpj@yahoo-inc.com>
> >>>> wrote:
> >>>>
> >>>>> Since you started this thread, I've been thinking about the idea of
> >>>>> backing up, and I'm not sure I understand the motivation and if it is ok to
> >>>>> violate safety properties.
> >>>>>
> >>>>> Given that ZooKeeper is used for coordination, I would think that in many
> >>>>> cases all its state can be reconstructed in an algorithmic manner. Perhaps
> >>>>> the use case for a backup would be the one in which it is being used as a
> >>>>> database, for example, to keep the metadata of a file system. Periodic
> >>>>> backups or even keeping an observer, however, won't guarantee that if you
> >>>>> bring the system up using that backup you'll have all committed operations.
> >>>>> The state of the leader reflects all committed operations, but one needs to
> >>>>> have the latest state of the transaction log to not miss an update.
> >>>>>
> >>>>> But, it is true that I'm assuming that you can't miss updates. If you can
> >>>>> miss updates, then that's a different story. By missing updates we'll be
> >>>>> violating durability, which is a property that ZooKeeper is supposed to
> >>>>> provide, so I'm trying to understand in which cases violating durability
> >>>>> would be acceptable. If it is not acceptable and you still want to have a
> >>>>> backup, then I don't see a way other than shutting down the clients before
> >>>>> you take a backup, which doesn't seem to be what is being proposed here.
> >>>>>
> >>>>> -Flavio
> >>>>>
> >>>>>
> >>>>> On Jan 18, 2012, at 1:38 AM, Jordan Zimmerman wrote:
> >>>>>
> >>>>>> Neha - can you send me your email address? Send it to:
> >>>>>> jzimmerman@netflix.com
> >>>>>>
> >>>>>> On 1/17/12 10:10 AM, "Neha Narkhede" <neha.narkhede@gmail.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Jordan,
> >>>>>>>
> >>>>>>> I'd be interested in previewing it. Let me know.
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Neha
> >>>>>>>
> >>>>>>> On Mon, Jan 16, 2012 at 5:42 PM, Jordan Zimmerman
> >>>>>>> <jzimmerman@netflix.com> wrote:
> >>>>>>>
> >>>>>>>> We'll be backing up to S3. Wouldn't it be redundant to back up all the
> >>>>>>>> instances?
> >>>>>>>>
> >>>>>>>> -JZ
> >>>>>>>>
> >>>>>>>> P.S. I'm working on a ZooKeeper instance manager that will have
> >>>>>>>> backup/restore and a bunch of other stuff. We'll be open sourcing it. If
> >>>>>>>> anyone is interested in previewing it, let me know.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On 1/16/12 5:39 PM, "Patrick Hunt" <phunt@apache.org> wrote:
> >>>>>>>>
> >>>>>>>>> Why would you limit to the leader? Wouldn't backing up any server (as
> >>>>>>>>> long as it's active) be sufficient? If you search the list it's been
> >>>>>>>>> discussed before, using Observers seemed like a reasonable option as
> >>>>>>>>> well.
> >>>>>>>>>
> >>>>>>>>> Patrick
> >>>>>>>>>
> >>>>>>>>> On Fri, Jan 13, 2012 at 2:29 PM, Jordan Zimmerman
> >>>>>>>>> <jzimmerman@netflix.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> That's easy as the backup app is running on the same machine as the ZK
> >>>>>>>>>> instance. I can use 'stat' to see if "my" instance is the leader.
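
A minimal Python sketch of that check, assuming the backup app talks to the
local server's client port and sends the 'stat' four-letter command, whose
reply includes a "Mode:" line (leader, follower or standalone). The host,
port and function names below are illustrative assumptions, and newer
ZooKeeper releases may need 'stat' enabled through the 4lw.commands.whitelist
setting.

```python
import socket
from typing import Optional


def parse_mode(stat_output: str) -> Optional[str]:
    """Return the value of the 'Mode:' line from 'stat' output, if present."""
    for line in stat_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def query_stat(host: str = "127.0.0.1", port: int = 2181, timeout: float = 5.0) -> str:
    """Send the 'stat' four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stat")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def is_leader(host: str = "127.0.0.1", port: int = 2181) -> bool:
    """True if the server at host:port reports itself as the leader."""
    return parse_mode(query_stat(host, port)) == "leader"
```

The backup app could call is_leader() before each run and skip the backup when
it returns False.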
> >>>>>>>>>>
> >>>>>>>>>> On 1/13/12 2:28 PM, "Camille Fournier" <camille@apache.org>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> You want to have to figure out who the leader is every time you want to
> >>>>>>>>>>> take a backup? That would be the downside to this strategy I would
> >>>>>>>>>>> think.
> >>>>>>>>>>>
> >>>>>>>>>>> C
> >>>>>>>>>>>
> >>>>>>>>>>> From my phone
> >>>>>>>>>>> On Jan 13, 2012 5:24 PM, "Jordan Zimmerman" <jzimmerman@netflix.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> As a backup strategy, it seems I would only want to back up snapshots
> >>>>>>>>>>>> from the leader. Does that make sense?
> >>>>>>>>>>>>
> >>>>>>>>>>>> -JZ
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>> flavio
> >>>>> junqueira
> >>>>>
> >>>>> research scientist
> >>>>>
> >>>>> fpj@yahoo-inc.com
> >>>>> direct +34 93-183-8828
> >>>>>
> >>>>> avinguda diagonal 177, 8th floor, barcelona, 08018, es
> >>>>> phone (408) 349 3300    fax (408) 349 3301
> >>>>>
> >>>>>
> >>>>>
> >>>
> >>>
> >>
> >
> >
>
