zookeeper-user mailing list archives

From Flavio Junqueira <...@yahoo-inc.com>
Subject Re: Backups
Date Thu, 19 Jan 2012 18:39:15 GMT
Hi Ted,

Znodes for leader election, group membership, etc., can all be  
recreated, so why should I back them up instead of recreating the  
znodes? In fact, one might bring back a previous snapshot of the  
system that reflects an incorrect system state.

In the case that one stores data that can't be recovered by other  
means, I understand the need, but then we have the durability problem  
that I mentioned and you apparently agreed with. Also, ZooKeeper is a  
replicated service, so why can't you simply rely upon the replication  
strategy that ZooKeeper provides to you already? Again, I'm trying to  
understand the use cases here.

Thanks,
-Flavio
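
To make the durability concern concrete: a snapshot on its own only captures state up to some zxid, so a restorable backup also needs the transaction logs that cover everything after it. The sketch below is not from the thread. It assumes the naming used in ZooKeeper's data directory (`snapshot.<hex zxid>` and `log.<hex zxid>`, where a log's suffix is the first zxid written to it), and `files_to_back_up` is a hypothetical helper name.

```python
import re

# Assumed naming convention: "snapshot.<hex zxid>" and "log.<hex zxid>".
_NAME = re.compile(r"^(snapshot|log)\.([0-9a-fA-F]+)$")


def files_to_back_up(filenames):
    """Return (latest_snapshot, logs_needed) for a list of data-dir file names.

    logs_needed is the log that may straddle the snapshot's zxid plus every
    log that starts after it, in zxid order.
    """
    snapshots, logs = [], []
    for name in filenames:
        match = _NAME.match(name)
        if not match:
            continue  # ignore anything that is not a snapshot or a log
        zxid = int(match.group(2), 16)
        target = snapshots if match.group(1) == "snapshot" else logs
        target.append((zxid, name))

    logs.sort()
    if not snapshots:
        # Without a snapshot, every log is needed to rebuild the state.
        return None, [name for _, name in logs]

    snap_zxid, snap_name = max(snapshots)
    # The newest log starting at or before the snapshot may still hold
    # transactions that are newer than the snapshot itself.
    older = [entry for entry in logs if entry[0] <= snap_zxid]
    newer = [entry for entry in logs if entry[0] > snap_zxid]
    needed = older[-1:] + newer
    return snap_name, [name for _, name in needed]
```

Even with these files copied, any transaction committed after the copy was taken is missing from the backup, which is the durability gap discussed in this thread.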

On Jan 19, 2012, at 7:11 PM, Ted Dunning wrote:

> A backup can still be useful.  It is a common property that a database
> backup is known to be slightly out of date.
>
> Such a backup can still be very useful.  In many systems, the most common
> cause of error is simple human intervention.  This especially applies to
> file systems and databases, but can still apply to ZK if an admin
> carelessly tries to clean up part of the namespace and accidentally cleans
> up all of it.  This should be much less common with ZK because manual
> adjustments are so much less a part of standard operation, but they can
> still occur.  In these cases, an out-of-date backup may be enormously
> valuable.
>
> If somebody wants a precise backup from a particular moment in time, the
> best option is to use the snapshot capabilities exposed by various file
> systems.  Traditional NAS vendors all support this.  At a lower cost and
> complexity point, you can get this from MapR clusters exposed as NFS or by
> a ZFS file system.  This option also allows you to keep multiple snapshots
> from points in the past.
>
> What Jordan is doing would allow backups without special storage devices
> and, with good backup of the log, would allow nearly current recovery in
> the event of catastrophic loss.  Yes, this loses some durability, but it is
> still very desirable.
>
> On Thu, Jan 19, 2012 at 11:07 AM, Flavio Junqueira <fpj@yahoo-inc.com> wrote:
>
>> Since you started this thread, I've been thinking about the idea of
>> backing up, and I'm not sure I understand the motivation and if it is ok to
>> violate safety properties.
>>
>> Given that ZooKeeper is used for coordination, I would think that in many
>> cases all its state can be reconstructed in an algorithmic manner. Perhaps
>> the use case for a backup would be the one in which it is being used as a
>> database, for example, to keep the metadata of a file system. Periodic
>> backups or even keeping an observer, however, won't guarantee that if you
>> bring the system up using that backup you'll have all committed operations.
>> The state of the leader reflects all committed operations, but one needs to
>> have the latest state of the transaction log to not miss an update.
>>
>> But, it is true that I'm assuming that you can't miss updates. If you can
>> miss updates, then that's a different story. By missing updates we'll be
>> violating durability, which is a property that ZooKeeper is supposed to
>> provide, so I'm trying to understand in which cases violating durability
>> would be acceptable. If it is not acceptable and you still want to have a
>> backup, then I don't see a way other than shutting down the clients before
>> you take a backup, which doesn't seem to be what is being proposed here.
>>
>> -Flavio
>>
>>
>> On Jan 18, 2012, at 1:38 AM, Jordan Zimmerman wrote:
>>
>>> Neha - can you send me your email address. Send it to:
>>> jzimmerman@netflix.com
>>>
>>> On 1/17/12 10:10 AM, "Neha Narkhede" <neha.narkhede@gmail.com> wrote:
>>>
>>>> Jordan,
>>>>
>>>> I'd be interested in previewing it. Let me know.
>>>>
>>>> Thanks,
>>>> Neha
>>>>
>>>> On Mon, Jan 16, 2012 at 5:42 PM, Jordan Zimmerman
>>>> <jzimmerman@netflix.com> wrote:
>>>>
>>>>> We'll be backing up to S3. Wouldn't it be redundant to back up all the
>>>>> instances?
>>>>>
>>>>> -JZ
>>>>>
>>>>> P.S. I'm working on a ZooKeeper instance manager that will have
>>>>> backup/restore and a bunch of other stuff. We'll be open sourcing it. If
>>>>> anyone is interested in previewing it let me know.
>>>>>
>>>>>
>>>>> On 1/16/12 5:39 PM, "Patrick Hunt" <phunt@apache.org> wrote:
>>>>>
>>>>>> Why would you limit to the leader? Wouldn't backing up any server (as
>>>>>> long as it's active) be sufficient? If you search the list it's been
>>>>>> discussed before, using Observers seemed like a reasonable option as
>>>>>> well.
>>>>>>
>>>>>> Patrick
>>>>>>
>>>>>> On Fri, Jan 13, 2012 at 2:29 PM, Jordan Zimmerman
>>>>>> <jzimmerman@netflix.com> wrote:
>>>>>>
>>>>>>> That's easy as the backup app is running on the same machine as the ZK
>>>>>>> instance. I can use 'stat' to see if "my" instance is the leader.
>>>>>>>
>>>>>>> On 1/13/12 2:28 PM, "Camille Fournier" <camille@apache.org> wrote:
>>>>>>>
>>>>>>>> You want to have to figure out who the leader is every time you want to
>>>>>>>> take a backup? That would be the downside to this strategy I would
>>>>>>>> think.
>>>>>>>>
>>>>>>>> C
>>>>>>>>
>>>>>>>> From my phone
>>>>>>>> On Jan 13, 2012 5:24 PM, "Jordan Zimmerman" <jzimmerman@netflix.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> As a backup strategy, it seems I would only want to back up snapshots
>>>>>>>>> from the leader. Does that make sense?
>>>>>>>>>
>>>>>>>>> -JZ
>>>>>>>>>
>> flavio
>> junqueira
>>
>> research scientist
>>
>> fpj@yahoo-inc.com
>> direct +34 93-183-8828
>>
>> avinguda diagonal 177, 8th floor, barcelona, 08018, es
>> phone (408) 349 3300    fax (408) 349 3301
>>
>>
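
Jordan's suggestion of using 'stat' to check whether the local instance is the leader could look roughly like the following. This is a sketch, not Jordan's tool: it assumes the `stat` four-letter command is reachable on the client port (2181 here is an assumption) and that its output contains a `Mode:` line. The names `four_letter_word`, `parse_mode`, and `is_leader` are hypothetical. Newer ZooKeeper releases may require four-letter commands to be explicitly enabled in the server configuration.

```python
import socket


def four_letter_word(host, port, command, timeout=5.0):
    """Send a four-letter command to a ZooKeeper server and return its reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(stat_output):
    """Extract the value of the 'Mode:' line, or None if it is absent."""
    for line in stat_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Mode":
            return value.strip()
    return None


def is_leader(host="localhost", port=2181):
    """Return True if the server at host:port reports itself as the leader."""
    return parse_mode(four_letter_word(host, port, "stat")) == "leader"
```

A backup job running next to each server could call `is_leader()` before copying files, though as Patrick points out above, restricting backups to the leader may not be necessary.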
