Subject: Re: Getting confused with the "recipe for lock"
From: Hulunbier <hulunbier@gmail.com>
To: user@zookeeper.apache.org
Date: Sat, 12 Jan 2013 18:30:53 +0800

Thanks Jordan,

> If client1's heartbeat fails its main watcher will get a Disconnect event

Suppose the network link between client1 and the server is of very low
quality (a high packet loss rate?) but still functional. Client1 may be
happily sending heartbeat messages to the server without noticing
anything, but the ZK server could be unable to receive heartbeats from
client1 for a long period of time, which leads the ZK server to time out
client1's session and delete the ephemeral node.

Thus, client1's session could be timed out by the ZK server without a
Disconnect event ever being triggered on the client.

> Well behaving ZK applications must watch for this and assume that it no
> longer holds the lock and, thus, should delete its node. If client1
> needs the lock again it should try to re-acquire it from step 1 of the
> recipe. Further, well behaving ZK applications must re-try node deletes
> if there is a connection problem. Have a look at Curator's
> implementation for details.
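If I read that advice correctly, the watcher side would look roughly
like the sketch below (my own minimal sketch against the raw ZooKeeper
Java API; the class name, the lock-node path, and the give-up policy
are my guesses, not Curator's actual code):

    import org.apache.zookeeper.KeeperException;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    // Sketch of a "well behaving" lock holder: on Disconnected/Expired,
    // stop trusting the lock and re-try the delete until it succeeds.
    public class LockSessionWatcher implements Watcher {
        private final ZooKeeper zk;     // an already-connected handle
        private final String lockNode;  // e.g. "/locknode/guid-lock-0000000001"

        public LockSessionWatcher(ZooKeeper zk, String lockNode) {
            this.zk = zk;
            this.lockNode = lockNode;
        }

        @Override
        public void process(WatchedEvent event) {
            if (event.getState() == Event.KeeperState.Disconnected
                    || event.getState() == Event.KeeperState.Expired) {
                // Conservatively assume the server has expired (or will
                // expire) the session and deleted our ephemeral node.
                releaseLock();
            }
        }

        private void releaseLock() {
            while (true) {
                try {
                    zk.delete(lockNode, -1);  // -1 matches any version
                    return;
                } catch (KeeperException.NoNodeException e) {
                    return;  // node already gone (e.g. session expired)
                } catch (KeeperException.ConnectionLossException e) {
                    // transient; re-try the delete, as the recipe requires
                } catch (KeeperException e) {
                    return;  // e.g. SessionExpiredException: node is gone
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

Even with such a watcher, my worry still stands: the Disconnected event
may arrive (or be acted on) only after the server has already expired
the session and notified the next client in line.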
Thanks for pointing me to Curator's implementation; I will dig into the
source code.

But I still feel that, no matter how well a ZK application behaves, if we
use ephemeral nodes in the lock recipe we cannot guarantee that "at any
snapshot in time no two clients think they hold the same lock", which is
the fundamental requirement/constraint for a lock.

Mr. Andrey Stepachev suggested that I use a timer on the client side to
track the session timeout (a rough sketch of what I understand him to
mean is at the end of this mail). That sounds reasonable, but I think it
implicitly assumes a bound on clock drift - which is not something I
expected in a solution based on ZooKeeper (ZK is supposed to keep the
animals well).

On Sat, Jan 12, 2013 at 4:20 AM, Jordan Zimmerman wrote:
>
> If client1's heartbeat fails its main watcher will get a Disconnect
> event. Well behaving ZK applications must watch for this and assume
> that it no longer holds the lock and, thus, should delete its node. If
> client1 needs the lock again it should try to re-acquire it from step 1
> of the recipe. Further, well behaving ZK applications must re-try node
> deletes if there is a connection problem. Have a look at Curator's
> implementation for details.
>
> -JZ
>
> On Jan 11, 2013, at 5:46 AM, Zhao Boran wrote:
>
> > While reading the zookeeper's recipe for locks, I get confused:
> >
> > Seems that this recipe-for-distributed-lock can not guarantee *"at
> > any snapshot in time no two clients think they hold the same lock"*.
> >
> > But since zookeeper is so widely adopted, if there were such a
> > mistake in the reference doc, someone should have pointed it out a
> > long time ago.
> >
> > So, what did I misunderstand? Please help me!
> >
> > Recipe-for-distributed-lock (from
> > http://zookeeper.apache.org/doc/trunk/recipes.html#sc_recipes_Locks)
> >
> > Locks
> >
> > Fully distributed locks that are globally synchronous, *meaning at
> > any snapshot in time no two clients think they hold the same lock*.
> > These can be implemented using ZooKeeper. As with priority queues,
> > first define a lock node.
> >
> > 1. Call create( ) with a pathname of "*locknode*/guid-lock-" and the
> >    sequence and ephemeral flags set.
> > 2. Call getChildren( ) on the lock node without setting the watch
> >    flag (this is important to avoid the herd effect).
> > 3. If the pathname created in step 1 has the lowest sequence number
> >    suffix, the client has the lock and the client exits the protocol.
> > 4. The client calls exists( ) with the watch flag set on the path in
> >    the lock directory with the next lowest sequence number.
> > 5. If exists( ) returns false, go to step 2. Otherwise, wait for a
> >    notification for the pathname from the previous step before going
> >    to step 2.
> >
> > Considering the following case:
> >
> > - Client1 successfully acquired the lock (in step 3), with zk node
> >   "locknode/guid-lock-0";
> >
> > - Client2 created node "locknode/guid-lock-1", failed to acquire the
> >   lock, and is watching "locknode/guid-lock-0";
> >
> > - Later, for some reason (network congestion?), client1 failed to
> >   send a heartbeat message to the zk cluster on time, but client1 is
> >   still perfectly working, and assumes it is still holding the lock.
> >
> > - But ZooKeeper may think client1's session has timed out, and then
> >   1. deletes "locknode/guid-lock-0",
> >   2. sends a notification to Client2 (or sends the notification
> >      first?),
> >   3. but can not send a "session timeout" notification to client1 in
> >      time (due to network congestion?).
> >
> > - Client2 gets the notification, goes to step 2, sees the only node
> >   "locknode/guid-lock-1", which was created by itself; thus, client2
> >   assumes it holds the lock.
> >
> > - But at the same time, client1 also assumes it holds the lock.
> >
> > Is this a valid scenario?
> >
> > Thanks a lot!
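For reference, here is roughly what I understand Andrey's client-side
timer suggestion to mean (my own sketch, not his code; driftMarginMs is
a hypothetical safety margin, and the whole approach only narrows the
window - it still rests on the clock-drift assumption I mentioned
above):

    import java.util.concurrent.TimeUnit;

    // Client-side "lease": after a Disconnected event, stop trusting
    // the lock once the session timeout (minus a drift margin) has
    // elapsed locally, even if no Expired event has arrived yet.
    public class LockLease {
        private final long sessionTimeoutMs;  // ZooKeeper#getSessionTimeout()
        private final long driftMarginMs;     // assumed bound on clock drift
        private volatile long disconnectedAtNanos = -1L;

        public LockLease(long sessionTimeoutMs, long driftMarginMs) {
            this.sessionTimeoutMs = sessionTimeoutMs;
            this.driftMarginMs = driftMarginMs;
        }

        // Call these from the Watcher on connection state changes.
        public void onDisconnected() { disconnectedAtNanos = System.nanoTime(); }
        public void onReconnected()  { disconnectedAtNanos = -1L; }

        // Guard every piece of work done under the lock with this check.
        public boolean lockStillValid() {
            long since = disconnectedAtNanos;
            if (since < 0) {
                return true;  // connected: the server would notify us
            }
            long elapsedMs =
                TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - since);
            return elapsedMs < sessionTimeoutMs - driftMarginMs;
        }
    }

Even so, between the server expiring the session and lockStillValid()
turning false there remains a window whose size depends on how far the
two clocks' rates diverge - which is exactly the extra constraint I was
not expecting.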