Subject: Re: Why can't an ephemeral non sequential node creation be used as a lock?
From: Chris Nauroth
To: "user@zookeeper.apache.org", "zookeeper-user@hadoop.apache.org"
Date: Tue, 3 Nov 2015 18:44:59 +0000

Hi Kamel,

The implementation you described (ephemeral but not sequential) can be
prone to the thundering herd effect [1]. One client will obtain the
lock, and the other 2 will need to call exists to set a watch on the
lock znode created by that client. This is important to detect if the
client holding the lock releases that lock (either intentionally or
unintentionally due to process death), so that another one of the
client processes can acquire the lock and make progress. Since all
clients set a watch on that same znode, all clients will wake up and
try to acquire the lock again by recreating that znode, but only one
can succeed.

The standard recipe (ephemeral and sequential) does not suffer from the
thundering herd effect. This is because each sequential znode is only
watched by a single other client, so the release of the lock (deletion
of the znode) only wakes up a single other client. More details on this
are discussed in the recipe documentation [2].

This problem is somewhat analogous to Object#notify [3] vs.
Object#notifyAll [4] in concurrent Java programming.

This might not be a big deal with only 3 clients. With a large number
of clients contending on the lock, the thundering herd effect can
generate a lot of wasteful extra work.

I hope this helps.
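
To put rough numbers on the difference, here is a minimal sketch that
counts watch notifications under each scheme. It is an in-memory model,
not ZooKeeper client code: the class and method names are made up for
illustration, and it assumes one client holds the lock at the start and
that each release hands the lock to exactly one waiter.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical in-memory model for illustration; no ZooKeeper calls are made.
final class LockWakeupModel {

    // Single shared znode: every waiter watches the same node, so each
    // release notifies all clients that are still waiting.
    static long sharedNodeWakeups(int clients) {
        int waiting = Math.max(clients - 1, 0); // one client holds the lock at the start
        long wakeups = 0;
        while (waiting > 0) {
            wakeups += waiting; // deleting the znode fires every outstanding watch
            waiting--;          // exactly one waiter wins the race to recreate it
        }
        return wakeups;
    }

    // Sequential znodes: each waiter watches only its predecessor, so each
    // release notifies at most one client.
    static long sequentialNodeWakeups(int clients) {
        Deque<Integer> queue = new ArrayDeque<>();
        for (int seq = 0; seq < clients; seq++) {
            queue.addLast(seq); // lowest sequence number holds the lock
        }
        long wakeups = 0;
        while (!queue.isEmpty()) {
            queue.removeFirst(); // the holder deletes its znode
            if (!queue.isEmpty()) {
                wakeups++;       // only the next znode's owner is notified
            }
        }
        return wakeups;
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 100}) {
            System.out.printf("%d clients: shared=%d sequential=%d%n",
                    n, sharedNodeWakeups(n), sequentialNodeWakeups(n));
        }
    }
}
```

Under these assumptions the shared znode costs 3 notifications for 3
clients versus 2 for the sequential scheme, and 4950 versus 99 for 100
clients. The shared count grows quadratically with the number of
clients, while the sequential count grows linearly.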

[1] https://en.wikipedia.org/wiki/Thundering_herd_problem
[2] http://zookeeper.apache.org/doc/r3.4.6/recipes.html#sc_recipes_Locks
[3] http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#notify()
[4] http://docs.oracle.com/javase/7/docs/api/java/lang/Object.html#notifyAll()

--Chris Nauroth

On 11/3/15, 10:30 AM, "kamel.zaarouri@gmail.com" wrote:

>Hi,
>
>I have 3 zookeeper clients that will receive a request within seconds
>apart. Only 1 client is allowed to handle this request.
>
>I was thinking that each client will try to create the same ephemeral
>node non sequential node. The client that can is by definition the leader
>and is the one that will handle the request.
>
>But then I saw that there's a recipe for creating a lock.
>
>Would the above strategy work or should I use the recipe? Can someone
>tell me what could go wrong with what I described?
>
>Thanks