Reply-To: zookeeper-dev@hadoop.apache.org
From: "Flavio Junqueira"
Subject: RE: javadoc for the Write Lock / Leader Election
Date: Fri, 18 Jul 2008 16:40:21 +0200
Message-ID:
 <000a01c8e8e4$3547bae0$70060a0a@ds.corp.yahoo.com>

Hi James,

The fact that the client's node has another node n ahead of it in the
sequence order doesn't mean that the owner of n is aware that it is the
lock holder or the leader. This is because operations are propagated
asynchronously.

Also, a getChildren() call doesn't guarantee that you have the latest
list; it is possible that another node is at the head of the ordered
list of nodes by the time you read the response. This is because
getChildren() returns the local state of one server, while the ensemble
of servers may be processing, or may even have already decided upon, a
change to the list.

As I understand Jacob's suggestion, a leader client creates a separate
node to acknowledge that it is aware that it is the leader, and so is
ready to perform that role.

-Flavio

> -----Original Message-----
>
> One thing confused me though; the last paragraph says...
>
> This protocol guarantees that there is at any time only one node that
> thinks it is the leader. But it does not disseminate information about
> who is the leader. If you want everyone to know who is the leader, you
> can have an additional Znode whose value is the name of the current
> leader (or some identifying information on how to contact the leader,
> etc.). Note that this cannot be done atomically, so by the time other
> nodes find out who the leader is, the leadership may already have
> passed on to a different node.
>
> In the current implementation, WriteLock - each znode can know,
> whenever it attempts to acquire the lock - if it didn't get the lock,
> who the owner is.
> I guess this is only true momentarily the split
> second that the acquire() method is called (i.e. the exact moment the
> getChildren() is called and the lowest value is found). Or is there
> some other subtle issue I'm not seeing?
>
> I guess we could add a method to WriteLock - if folks wanted - a kinda
> queryLeader() method where we just use the same algorithm to find who
> the current leader is - if folks cared. Though am not sure how useful
> knowing who the leader is :). Though I guess writing the leader's
> identity to some canonical znode that any other znode can read
> whenever it wishes is less risky and maybe simpler.
>
> --
> James
> -------
> http://macstrac.blogspot.com/
>
> Open Source Integration
> http://open.iona.com
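
The two points in the reply (a getChildren() response reflects one server's local state and may be stale, and a leader can acknowledge its role through a separate node) can be sketched with a small in-memory model. This is only an illustration under stated assumptions: it does not use the ZooKeeper client, the node names are made up, and sequence_of, head_of, is_ready_leader and the "acked" set are hypothetical helpers, not part of WriteLock or any ZooKeeper API.

```python
# In-memory model only: no real ZooKeeper client is used here.
# All names (sequence_of, head_of, is_ready_leader, the "acked" set) are
# hypothetical and exist just for this illustration.

def sequence_of(child_name):
    """Return the numeric suffix of a sequential node name, e.g. 'x-0000000003' -> 3."""
    return int(child_name.rsplit("-", 1)[1])


def head_of(children):
    """Return the child with the lowest sequence number, or None if the list is empty."""
    return min(children, key=sequence_of) if children else None


def is_ready_leader(candidate, acked):
    """A candidate counts as an active leader only once it has acknowledged the role."""
    return candidate is not None and candidate in acked


# Snapshot returned by one server (possibly stale) versus the list the
# ensemble has already agreed on: x-0000000001 has since been deleted.
stale_snapshot = ["x-0000000002", "x-0000000001", "x-0000000003"]
committed_list = ["x-0000000002", "x-0000000003"]

stale_head = head_of(stale_snapshot)
current_head = head_of(committed_list)

# Acknowledgment nodes created so far; nobody has acknowledged yet.
acked = set()
ready_before_ack = is_ready_leader(current_head, acked)

# The owner of the head node learns it is first and creates its ack node.
acked.add(current_head)
ready_after_ack = is_ready_leader(current_head, acked)
```

In this model a reader holding the stale snapshot would still name x-0000000001 as the owner, and even the correct head is not treated as an active leader until its acknowledgment exists, which is the gap the separate node is meant to close.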