In-Reply-To: <649974.8956.qm@web130110.mail.mud.yahoo.com>
Date: Fri, 18 Mar 2011 14:59:43 -0700
Subject: Re: Using ZK for real-time group membership notification
From: Benjamin Reed
To: user@zookeeper.apache.org

in a distributed setting such an answer is impossible, especially given the theory of relativity and the speed of light. a machine may fail right after sending a heartbeat, or another may come online right after sending a report. even if zookeeper could provide this, you would still have thread scheduling issues on the local machine, which means that you are operating on old information.

to deal with this, applications can use views, which allow clients to reconcile differences. for example, if two processes communicate and one has a different list of members than the other, then they can both consult zookeeper to reconcile, or use the membership list with the highest zxid. the other option is to count on everyone eventually converging.

i would not develop a distributed system with the assumption that "all group members know *the exact number of members at all times*".

ben

On Fri, Mar 18, 2011 at 2:02 PM, Otis Gospodnetic wrote:
> Hi,
>
> Short version:
> How can ZK be used to make sure that all group members know *the exact number of
> members at all times*?
>
> I have an app that can be run on 1 or more servers.  New instances of the app
> come and go, may die, etc. -- the number of the app instances is completely
> dynamic.  At any one time, as these apps come and go, each live instance of the
> app needs to know how many instances there are in total.  If a new instance of the
> app is started, all instances need to know the new total number of instances.
> If an app is stopped or if it dies, the remaining apps need to know the new
> number of app instances.
>
> Also, and this is critical, they need to know about these additions/removals of
> apps right away and they all need to find out about them at the same time.
> Basically, all members of some group need to know *the exact number of members at all
> times*.
>
> This sounds almost like we need to watch a "parent group znode" and monitor the
> number of its ephemeral children, which represent each app instance that is
> watching the "parent group znode".  Is that right?  If so, then all I'd need to
> know is the answer to "How many watchers are watching this znode?" or "How many
> kids does this znode have?". And I'd need ZK to notify all watchers whenever the
> answer to this question changes.  Ideally it would send/push the answer (the
> number of watchers) to all watchers, but if not, I assume any watcher that is
> notified about the change would go poll ZK to get the number of ephemeral kids.
>
> I think the above is essentially what's described on
> http://eng.wealthfront.com/2010/01/actually-implementing-group-management.html ,
> but doesn't answer the part that's critical for me (the very first Q up above).
>
> Thanks,
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
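
The reconciliation suggested in the reply (two processes compare membership lists and keep the one with the highest zxid) can be sketched without a ZooKeeper connection. This is only an illustration under stated assumptions: `MembershipView` and `reconcile` are hypothetical names, not part of any ZooKeeper API, and the `zxid` field stands in for whatever zxid a client recorded when it read the children of the group znode.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class MembershipView:
    """A client's snapshot of the group (hypothetical type, not a ZooKeeper API)."""
    zxid: int                # zxid the client recorded when it read the member list
    members: FrozenSet[str]  # names of the ephemeral children seen at that point


def reconcile(mine: MembershipView, theirs: MembershipView) -> MembershipView:
    """Return the more recent of two views, judged by zxid; ties keep `mine`."""
    return theirs if theirs.zxid > mine.zxid else mine


# Two processes that read the group at different times.
a = MembershipView(zxid=100, members=frozenset({"app-1", "app-2"}))
b = MembershipView(zxid=105, members=frozenset({"app-1", "app-2", "app-3"}))

# Whichever side runs the comparison, both end up holding the newer view.
agreed = reconcile(a, b)
print(sorted(agreed.members))
```

Even the agreed view may already be out of date by the time it is used, which is the point of the reply: the two processes converge on a common view, but neither gets an exact count "at all times".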