zookeeper-user mailing list archives

From Sergei Babovich <sbabov...@demandware.com>
Subject Re: DR policies/HA setup in production - best practices
Date Mon, 03 Jan 2011 22:36:10 GMT
Thanks a lot! Really helped!
On 01/03/2011 05:31 PM, Mahadev Konar wrote:
> Sergei,
>   I think Ted already answered your question, but in case you are interested in
> more details, please take a look at
> http://hadoop.apache.org/zookeeper/docs/r3.2.1/zookeeperInternals.html
> Thanks
> mahadev
> On 1/3/11 1:43 PM, "Ted Dunning" <ted.dunning@gmail.com> wrote:
>> Actually, ZK is very good in this regard.
>> The lifetime of a single leader is denoted by an epoch number.  Transactions
>> are identified by an epoch and a sequence number assigned by the leader.
>> Since there is only one leader and because all transactions are executed
>> serially, this combination of epoch and transaction id uniquely specifies a
>> transaction and provides a complete ordering.
>> As transactions are committed, members of the committing quorum record the
>> latest epoch and transaction.
>> When you restart a cluster, the members of the cluster negotiate to
>> determine who has the latest transaction and then start from there.  As
>> such, it is probably a good idea to back up more than just one log+snapshot
>> so that you have a better chance of having a later copy.
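The ordering Ted describes can be sketched in a few lines. This is only an illustration, not ZooKeeper's actual election code: it assumes the layout given in the ZooKeeper internals documentation, where a transaction id (zxid) is a 64-bit number with the epoch in the high 32 bits and the counter in the low 32 bits. The names `split_zxid`, `pick_latest`, and the `dr-node-*` hosts are hypothetical.

```python
from typing import Mapping, NamedTuple


class Zxid(NamedTuple):
    """Epoch and per-epoch counter; tuples compare by epoch first, then counter."""
    epoch: int
    counter: int


def split_zxid(zxid: int) -> Zxid:
    # Assumption: high 32 bits hold the epoch, low 32 bits the counter.
    return Zxid(epoch=zxid >> 32, counter=zxid & 0xFFFFFFFF)


def pick_latest(last_zxids: Mapping[str, int]) -> str:
    """Return the name of the member whose last recorded zxid is the newest."""
    return max(last_zxids, key=lambda name: split_zxid(last_zxids[name]))


# Hypothetical last-recorded zxids found in three restored backups.
backups = {
    "dr-node-1": (3 << 32) | 120,
    "dr-node-2": (4 << 32) | 7,
    "dr-node-3": (3 << 32) | 512,
}
latest = pick_latest(backups)
```

Here `dr-node-2` is chosen even though its counter is the smallest, because a higher epoch always outranks any counter from an earlier epoch. That is why keeping several log+snapshot copies helps: the restarted cluster can only start from the newest state that some member actually has.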
>> On Mon, Jan 3, 2011 at 12:58 PM, Sergei Babovich
>> <sbabovich@demandware.com> wrote:
>>> The DR strategy is also understood. What is the mechanism for ZK to
>>> resolve conflicts in such a case? Let's say we have a primitive backup
>>> strategy of shipping logs every hour. In theory it means (assuming the worst
>>> case) that on the DR site all servers will have snapshots of the data made at
>>> different points in time. When I bring the DR cluster up, what is the protocol
>>> for resolving inconsistencies? That was the reason for my question: it felt
>>> (maybe naively) that recovering by replicating from a single node's data
>>> (snapshot+log) would be a safer and more consistent approach, since it is
>>> easier to make guarantees about the result.