From: kishore g
To: zookeeper-user@hadoop.apache.org
Cc: "yonik@lucidimagination.com"
Date: Thu, 4 Feb 2010 15:00:29 -0800
Subject: Re: ephemeral node after server bounce

Worst-case option would be to use JVM shutdown hooks:
http://stackoverflow.com/questions/40376/handle-signals-in-the-java-virtual-machine

You can delete the znodes on exit, much like the deleteOnExit functionality of a File.

thanks,
Kishore G

On Thu, Feb 4, 2010 at 2:56 PM, Patrick Hunt wrote:

> hah, you guys beat me to the punch. I think having some unique per client
> token might also work (see my resp). Perhaps this is the ip of the host or
> better (esp if multiple clients on a single host) would be some solr
> specific id that uniquely identifies each node.
>
> Patrick
>
> Benjamin Reed wrote:
>
>> i second ted's proposals! thanx ted.
>>
>> there is one other option. when you create the ZooKeeper object you can
>> pass a session id and password. your bounced server can actually reattach
>> to the session. (that is why we put that constructor in.) to use it you
>> need to save the session id and password to a persistent store (a file)
>> when you first attach, and then when you restart read the id and password
>> from the file.
>>
>> ben
>>
>> Ted Dunning wrote:
>>
>>> On Thu, Feb 4, 2010 at 2:20 PM, Yonik Seeley wrote:
>>>
>>>> There's no way to "hand over" responsibility for an ephemeral znode,
>>>> right?
>>>
>>> Right.
>>>
>>>> We have solr nodes create ephemeral znodes (name based on host and
>>>> port).
>>>> The ephemeral znode takes some time to remove of course, so what
>>>> happens is that if I bounce a solr server (containing a zk client) the
>>>> ephemeral node will still exist when the server comes back up.
>>>
>>> This problem comes up with any system that has hysteresis and needs a
>>> single point of control.
>>>
>>>> What's the best way to handle this situation? Delete and re-create?
>>>
>>> Watch it and re-create when it does disappear?
>>> I think you need to handle the problem of multiple search nodes coming
>>> up on the same machine, possibly because the old one may have hung up.
>>>
>>> So... I would recommend
>>>
>>> a) if the ephemeral still exists, wait for a few more seconds to see if
>>> it disappears (20?)
>>>
>>> b) if it goes away, create a new one and continue as normal
>>>
>>> c) if it doesn't go away take additional action to determine if service
>>> is still running (i.e. panic and run in circles).
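
For what it's worth, here is a rough sketch of how Ted's steps a) to c) and the delete-on-exit idea could fit together. It is only an illustration under assumptions: EphemeralStore and EphemeralRegistrar are hypothetical names, not part of ZooKeeper, and the comments show which real client calls I would expect to sit behind each method. Runtime.addShutdownHook is the standard JVM call.

```java
/**
 * Hypothetical abstraction over the ZooKeeper calls involved (not a ZooKeeper type).
 * A real implementation would probably wrap:
 *   exists          -> zk.exists(path, false) != null
 *   createEphemeral -> zk.create(path, data, acl, CreateMode.EPHEMERAL)
 *   delete          -> zk.delete(path, -1)
 */
interface EphemeralStore {
    boolean exists(String path) throws Exception;
    void createEphemeral(String path) throws Exception;
    void delete(String path) throws Exception;
}

/** Hypothetical helper: wait for a stale ephemeral node to vanish, then re-create it. */
final class EphemeralRegistrar {
    private final EphemeralStore store;

    EphemeralRegistrar(EphemeralStore store) {
        this.store = store;
    }

    /**
     * Steps a) and b): poll while the old node still exists, then create a new one.
     * Returns false for step c): the node did not go away within maxWaitMillis,
     * so the caller has to work out whether the old process is still running.
     * Note that a real create can still fail if another client wins the race
     * between the last exists check and the create.
     */
    boolean register(String path, long maxWaitMillis, long pollMillis) throws Exception {
        long deadline = System.nanoTime() + maxWaitMillis * 1_000_000L;
        while (store.exists(path)) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // still there after the grace period
            }
            Thread.sleep(pollMillis);
        }
        store.createEphemeral(path);
        return true;
    }

    /**
     * Delete-on-exit: best-effort removal of the node when the JVM shuts down.
     * Hooks do not run on kill -9 or a crash, so session expiry stays the fallback.
     */
    Thread deleteOnExit(String path) {
        Thread hook = new Thread(() -> {
            try {
                store.delete(path);
            } catch (Exception e) {
                // Nothing useful to do during shutdown; the session timeout will clean up.
            }
        });
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

With Ted's suggested grace period this would be called as something like register(path, 20000, 500), followed by deleteOnExit(path) when it returns true. The poll interval is my own guess.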
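
Ben's session reattach option needs the session id and password to survive the restart. A minimal sketch of the save and load side follows, assuming a plain binary file is acceptable as the persistent store. SessionCredentials is a hypothetical helper class; the ZooKeeper calls named in the comments (getSessionId, getSessionPasswd, and the constructor that takes a session id and password) are the ones Ben refers to.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical holder for a ZooKeeper session id and password (not a ZooKeeper type). */
final class SessionCredentials {
    final long sessionId;
    final byte[] password;

    SessionCredentials(long sessionId, byte[] password) {
        this.sessionId = sessionId;
        this.password = password.clone();
    }

    /**
     * Call once the first connection is established, e.g.
     *   new SessionCredentials(zk.getSessionId(), zk.getSessionPasswd()).save(file);
     * The file holds the session password, so restrict its permissions.
     */
    void save(Path file) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            out.writeLong(sessionId);
            out.writeInt(password.length);
            out.write(password);
        }
    }

    /**
     * Call on restart, then reattach with
     *   new ZooKeeper(connectString, sessionTimeout, watcher, creds.sessionId, creds.password);
     * Reattaching only works if the session has not already expired on the server.
     */
    static SessionCredentials load(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            long id = in.readLong();
            int length = in.readInt();
            if (length < 0) {
                throw new IOException("corrupt session file: negative password length");
            }
            byte[] pw = new byte[length];
            in.readFully(pw);
            return new SessionCredentials(id, pw);
        }
    }
}
```

If the reattach succeeds, the ephemeral znode from before the bounce still belongs to the session, so nothing needs to be deleted or re-created.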