From: Sergei Babovich
To: "user@zookeeper.apache.org"
Date: Mon, 03 Jan 2011 17:36:10 -0500
Subject: Re: DR policies/HA setup in production - best practices
Message-ID: <4D224F5A.604@demandware.com>

Thanks a lot! Really helped!

On 01/03/2011 05:31 PM, Mahadev Konar wrote:
> Sergei,
> I think Ted already answered your question, but in case you are interested in
> more details, please take a look at
>
> http://hadoop.apache.org/zookeeper/docs/r3.2.1/zookeeperInternals.html
>
> Thanks
> mahadev
>
>
> On 1/3/11 1:43 PM, "Ted Dunning" wrote:
>
>> Actually, ZK is very good in this regard.
>>
>> The lifetime of a single leader is denoted by an epoch number. Transactions
>> are identified by an epoch and a sequence number assigned by the leader.
>> Since there is only one leader and because all transactions are executed
>> serially, this combination of epoch and transaction id uniquely specifies
>> a transaction and provides a complete ordering.
>>
>> As transactions are committed, members of the committing quorum record the
>> latest epoch and transaction.
>>
>> When you restart a cluster, the members of the cluster negotiate to
>> determine who has the latest transaction and then start from there. As
>> such, it is probably a good idea to back up more than just one log+snapshot
>> so that you have a better chance of having a later copy.
>>
>> On Mon, Jan 3, 2011 at 12:58 PM, Sergei Babovich wrote:
>>
>>> The DR strategy is also understood. What is the mechanism for ZK to
>>> resolve conflicts in such a case? Let's say we have a primitive backup
>>> strategy of shipping logs every hour. In theory it means (assuming the
>>> worst case) that on the DR site all servers will have snapshots of the
>>> data made at different points in time. When I bring the DR cluster up,
>>> what is the protocol for resolving inconsistencies? That was the reason
>>> for my question: it felt (maybe naively) that recovering by replicating
>>> from a single node's data (snapshot+log) would be a safer and more
>>> consistent approach, since it is easier to make guarantees about the
>>> result.
>>>
>>
>
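
The ordering Ted describes in the quoted reply (epoch first, then the sequence number assigned by the leader) can be sketched roughly as below. This is only a simplified illustration of comparing (epoch, sequence) pairs to find the member with the latest transaction; it is not ZooKeeper's actual election code, and the names TxnId, latest_member and the dr-node-* servers are hypothetical.

```python
from typing import Dict, NamedTuple


class TxnId(NamedTuple):
    """Hypothetical transaction identifier: leader epoch plus sequence number."""
    epoch: int
    counter: int


def latest_member(last_seen: Dict[str, TxnId]) -> str:
    """Return the name of the member that recorded the most recent transaction."""
    # Tuples compare field by field, so a higher epoch always wins and the
    # counter only breaks ties between transactions from the same epoch.
    return max(last_seen, key=lambda name: last_seen[name])


# Assumed scenario: three DR servers restored from backups taken at different times.
restored = {
    "dr-node-1": TxnId(epoch=3, counter=120),
    "dr-node-2": TxnId(epoch=4, counter=7),
    "dr-node-3": TxnId(epoch=4, counter=2),
}

print(latest_member(restored))  # prints "dr-node-2"
```

In this sketch dr-node-1 has the largest counter but an older epoch, so it loses to both servers from epoch 4. That is the sense in which servers restored from snapshots of different ages can still agree on a single starting point.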