Date: Tue, 16 Dec 2008 16:04:29 -0500
From: Thomas Vinod Johnson
Subject: Re: What happens when a server loses all its state?
To: zookeeper-user@hadoop.apache.org
Message-id: <494817DD.70800@sun.com>

Sorry, I should have been a little more explicit. The situation I'm
considering is this: out of 3 servers, one server, 'A', forgets its
persistent state (due to a bad disk, say) and restarts. My guess, from
what I could understand of the internals, was that server 'A' will
re-synchronize correctly on restart by getting the entire snapshot. I
just wanted to make sure that this is a good assumption to make, or find
out whether I am missing corner cases where the fact that A has lost all
memory could lead to inconsistencies (to take an example, in plain Paxos,
no acceptor can forget the highest-numbered prepare request to which it
has responded).

More generally, is it safe to assume that the ZooKeeper service will
maintain all its guarantees if a minority of servers lose persistent
state (due to bad disks, etc.) and restart at some point in the future?

Thanks.

Mahadev Konar wrote:
> Hi Thomas,
>
> If a zookeeper server loses all state and there are enough servers in the
> ensemble to continue a zookeeper service (like 2 servers in the case of
> an ensemble of 3), then the server will get the latest snapshot from the
> leader and continue.
>
> The idea of zookeeper persisting its state on disk is just so that it does
> not lose state. All the guarantees that zookeeper makes are based on the
> understanding that we do not lose the state of the data we store on disk.
>
> There might be problems if we lose the state that we stored on the disk.
> We might lose transactions that have been committed and the ensemble might
> start with some snapshot in the past.
>
> You might want to read through how zookeeper internals work. This will help
> you understand why the persistence guarantees are required.
>
> http://wiki.apache.org/hadoop-data/attachments/ZooKeeper(2f)ZooKeeperPresentations/attachments/zk-talk-upc.pdf
>
> mahadev
>
> On 12/16/08 9:45 AM, "Thomas Vinod Johnson" wrote:
>
>> What is the expected behavior if a server in a ZooKeeper service
>> restarts with all its prior state lost? Empirically, everything seems to
>> work*. Is this something that one can count on, as part of ZooKeeper
>> design, or are there known conditions under which this could cause
>> problems, either liveness or violation of ZooKeeper guarantees?
>>
>> I'm really most interested in a situation where a single server loses
>> state, but insights into issues when more than one server loses state
>> and other interesting failure scenarios are appreciated.
>>
>> Thanks.
>>
>> * The restarted server appears to catch up to the latest snapshot (from
>> the current leader?).
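
To make the corner case discussed in this thread concrete, here is a
minimal sketch of the quorum arithmetic only. It is a toy model, not
ZooKeeper code: the function names (majority, quorum_without_txn_exists)
and the server labels A, B, C are made up for illustration, and it
assumes only that a transaction counts as committed once a majority of
the ensemble has stored it, and that any majority can form a working
quorum. It asks whether, after some servers lose their disks, a majority
could form in which no member still holds the committed transaction.

```python
from itertools import combinations


def majority(ensemble_size):
    """Smallest number of servers that forms a majority quorum."""
    return ensemble_size // 2 + 1


def quorum_without_txn_exists(ensemble, holders, wiped):
    """Return True if some majority quorum contains no server that
    still holds the transaction.

    ensemble: all server names (hypothetical labels)
    holders:  servers that had persisted the transaction
    wiped:    servers that lost their persistent state and restarted
    """
    survivors = set(holders) - set(wiped)
    quorum_size = majority(len(ensemble))
    return any(
        survivors.isdisjoint(quorum)
        for quorum in combinations(sorted(ensemble), quorum_size)
    )


if __name__ == "__main__":
    ensemble = {"A", "B", "C"}
    # Transaction was acknowledged by A and B only (a bare majority).
    print(quorum_without_txn_exists(ensemble, {"A", "B"}, set()))   # False
    # A then loses its disk: the quorum {A, C} no longer holds the txn.
    print(quorum_without_txn_exists(ensemble, {"A", "B"}, {"A"}))   # True
    # If all three had stored it, losing A alone is harmless here.
    print(quorum_without_txn_exists(ensemble, ensemble, {"A"}))     # False
```

In this model the second case is the one to worry about: if B becomes
unreachable before A has re-synchronized, A and C are a majority that
never saw the committed transaction. Whether the real service can reach
that state depends on protocol details this sketch does not model.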