Date: Tue, 12 Oct 2010 09:54:10 -0700
Subject: Membership using ZK
From: Avinash Lakshman
To: zookeeper-user

This is what I have going: I have a group of 200 nodes that come up and each create an ephemeral entry under a znode named /Membership. When a node is detected as dead, the entry associated with it under /Membership is deleted and a watch is delivered to the rest of the members.

Now there are circumstances where a node A is deemed dead while the process is still up and running on A. This is a false detection, which I probably need to deal with. How do I deal with this situation? Over time, false detections delete all the entries underneath the /Membership znode even though all processes are up and running.

So my questions are:

Would the watches be pushed out to the node that is falsely deemed dead? If so, I can have that process recreate the ephemeral znode underneath /Membership.

If a node leaves a watch and then truly crashes, would it get the watches it missed during the interim period when it comes back up?

In any case, how do watches behave in the event of false/true failure detection?

Thanks
A
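
The setup described above can be sketched with a toy in-memory model. This is not the ZooKeeper client API: the class ToyMembership, its methods, and the session ids are hypothetical names used only for illustration. The sketch assumes that an ephemeral entry and any watch belong to a session, that both vanish when that session is expired, and that a watch fires once and must then be re-registered. Those assumptions should be checked against the ZooKeeper documentation before relying on them.

```python
class ToyMembership:
    """Toy stand-in for the children of /Membership (hypothetical, not ZooKeeper)."""

    def __init__(self):
        self.members = {}   # member name -> id of the session owning the entry
        self.watchers = {}  # session id -> callback, fired at most once

    def join(self, session_id, name):
        # Create the "ephemeral" entry, then notify whoever is watching.
        self.members[name] = session_id
        self._fire()

    def watch(self, session_id, callback):
        # One-shot watch: consumed when it fires (assumption of this sketch).
        self.watchers[session_id] = callback

    def expire(self, session_id):
        # The session is deemed dead: drop its entries and its watch,
        # then notify the remaining watchers.
        self.members = {n: s for n, s in self.members.items() if s != session_id}
        self.watchers.pop(session_id, None)
        self._fire()

    def _fire(self):
        # Deliver the current member list to every pending watcher and clear them.
        pending, self.watchers = self.watchers, {}
        for callback in pending.values():
            callback(sorted(self.members))


zk = ToyMembership()
seen = {"A": [], "B": []}

zk.join("sA", "A")
zk.join("sB", "B")
zk.watch("sA", lambda members: seen["A"].append(members))
zk.watch("sB", lambda members: seen["B"].append(members))

# A is falsely deemed dead: its entry disappears and only B is notified.
zk.expire("sA")

# In this model the process on A learns nothing from its old watch, so it
# has to notice the expiry some other way, open a new session and re-create
# its entry. B's one-shot watch was already consumed, so B does not see the
# rejoin unless it re-registers and re-reads the member list.
zk.join("sA2", "A")

print(seen)
print(sorted(zk.members))
```

In this model, recovering from a false detection is the responsibility of the affected process: it re-creates its entry under a fresh session, and every member re-reads the list and re-registers its watch after each notification rather than expecting missed notifications to be replayed.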