Message-ID: <4B6B527D.4030609@apache.org>
Date: Thu, 04 Feb 2010 15:04:29 -0800
From: Patrick Hunt <phunt@apache.org>
To: zookeeper-user@hadoop.apache.org
CC: "yonik@lucidimagination.com" <yonik@lucidimagination.com>
Subject: Re: ephemeral node after server bounce

Ah, excellent idea; it won't always work, but it may help. I think in this
case (ephemerals) all Yonik would need to do is close the session. That will
remove all of that session's ephemerals.
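
A minimal sketch of closing the session from a JVM shutdown hook follows. It uses `AutoCloseable` purely as a stand-in for the ZooKeeper handle so that it runs without the client library; with the real client the hook body would call `close()` on the `ZooKeeper` object. The class and method names are made up for illustration. Hooks do not run if the JVM is killed outright, which is one reason this won't always work.

```java
class SessionCloser {
    /** Builds (but does not register) a thread that closes the given session. */
    static Thread closeHook(AutoCloseable session) {
        return new Thread(() -> {
            try {
                session.close(); // with the real client: zk.close()
            } catch (Exception e) {
                // Little can be done this late; the ephemerals then go away
                // only once the session times out on the server side.
                System.err.println("session close failed: " + e);
            }
        }, "zk-session-closer");
    }

    /** Registers the hook so it runs on a normal JVM exit or on SIGTERM. */
    static void closeOnExit(AutoCloseable session) {
        Runtime.getRuntime().addShutdownHook(closeHook(session));
    }

    public static void main(String[] args) {
        // Stand-in session (assumption): it only reports that it was closed.
        AutoCloseable fakeSession = () -> System.out.println("session closed");
        closeOnExit(fakeSession);
        System.out.println("hook registered");
    }
}
```

Keeping `closeHook` separate from `closeOnExit` lets the hook be exercised directly without actually exiting the JVM.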

Patrick

kishore g wrote:
> Worst-case option would be to use JVM shutdown hooks:
> http://stackoverflow.com/questions/40376/handle-signals-in-the-java-virtual-machine
> 
> You can delete the znodes on exit, much like the deleteOnExit functionality
> of a File.
> 
> thanks,
> Kishore G
> 
> 
> 
> On Thu, Feb 4, 2010 at 2:56 PM, Patrick Hunt <phunt@apache.org> wrote:
> 
>> Hah, you guys beat me to the punch. I think having some unique per-client
>> token might also work (see my response). Perhaps this is the IP of the host
>> or, better (especially with multiple clients on a single host), some
>> Solr-specific ID that uniquely identifies each node.
>>
>> Patrick
>>
>>
>> Benjamin Reed wrote:
>>
>>> i second ted's proposals! thanx ted.
>>>
>>> there is one other option. when you create the ZooKeeper object you can
>>> pass a session id and password. your bounced server can actually reattach to
>>> the session. (that is why we put that constructor in.) to use it you need to
>>> save the session id and password to a persistent store (a file) when you
>>> first attach, and then when you restart read the id and password from the
>>> file.
>>>
>>> ben
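
A rough sketch of the save and restore step Ben describes, assuming the id and password come from `getSessionId()` and `getSessionPasswd()` on the `ZooKeeper` handle. The `SessionFile` class and the file layout (a long, a length, then the password bytes) are hypothetical; the ZooKeeper client itself is left out so the block runs on the JDK alone. Reattaching can only succeed while the old session has not yet expired.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class SessionFile {
    final long sessionId;
    final byte[] password;

    SessionFile(long sessionId, byte[] password) {
        this.sessionId = sessionId;
        this.password = password.clone();
    }

    /** Writes the id, the password length, and the password bytes. */
    void save(Path file) {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            out.writeLong(sessionId);
            out.writeInt(password.length);
            out.write(password);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads back what save() wrote. */
    static SessionFile load(Path file) {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            long id = in.readLong();
            byte[] pw = new byte[in.readInt()];
            in.readFully(pw);
            return new SessionFile(id, pw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Demo values only; real ones come from the connected client.
        Path file = Paths.get(System.getProperty("java.io.tmpdir"), "zk-session-demo.bin");
        new SessionFile(0x1234L, new byte[] {1, 2, 3}).save(file);
        SessionFile restored = load(file);
        // On restart, pass both values to the five-argument constructor:
        // new ZooKeeper(connectString, sessionTimeout, watcher, sessionId, sessionPasswd)
        System.out.println("restored session 0x" + Long.toHexString(restored.sessionId));
        file.toFile().delete();
    }
}
```

The file holds the session password, so it is worth restricting its permissions to the service account.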
>>>
>>> Ted Dunning wrote:
>>>
>>>> On Thu, Feb 4, 2010 at 2:20 PM, Yonik Seeley <yonik@lucidimagination.com> wrote:
>>>>
>>>>
>>>>> There's no way to "hand over" responsibility for an ephemeral znode,
>>>>> right?
>>>>>
>>>>>
>>>>>
>>>> Right.
>>>>
>>>>
>>>>
>>>>
>>>>> We have Solr nodes create ephemeral znodes (name based on host and
>>>>> port).
>>>>> The ephemeral znode takes some time to be removed, of course, so what
>>>>> happens is that if I bounce a Solr server (containing a ZK client), the
>>>>> ephemeral node will still exist when the server comes back up.
>>>>>
>>>>>
>>>>>
>>>> This problem comes up with any system that has hysteresis and needs a
>>>> single point of control.
>>>>
>>>>
>>>>
>>>>
>>>>> What's the best way to handle this situation?  Delete and re-create?
>>>>>
>>>>>
>>>>>
>>>> Watch it and re-create when it does disappear?
>>>> I think you need to handle the problem of multiple search nodes coming
>>>> up on the same machine, possibly because the old one may have hung up.
>>>>
>>>> So... I would recommend
>>>>
>>>> a) if the ephemeral still exists, wait for a few more seconds to see if
>>>>    it disappears (20?)
>>>>
>>>> b) if it goes away, create a new one and continue as normal
>>>>
>>>> c) if it doesn't go away take additional action to determine if service
>>>>    is still running (i.e. panic and run in circles).
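
Ted's steps a) to c) could be sketched roughly as below. The `Znodes` interface is a hypothetical stand-in for the client's exists and create calls so the block runs without ZooKeeper, and the path and timings are made up. It polls for simplicity; a real client would more likely set a watch on the old node. A real create can also still fail if another process wins the race between the check and the create.

```java
class EphemeralStartup {
    enum Outcome { CREATED, STILL_PRESENT }

    /** Hypothetical stand-in for the two ZooKeeper operations used here. */
    interface Znodes {
        boolean exists(String path);
        void createEphemeral(String path);
    }

    /** Checks up to maxPolls times, sleeping pollMillis between checks. */
    static Outcome claim(Znodes zk, String path, int maxPolls, long pollMillis) {
        for (int i = 0; i < maxPolls; i++) {
            if (i > 0) {
                try {
                    Thread.sleep(pollMillis);       // (a) still there: wait a little longer
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return Outcome.STILL_PRESENT;   // gave up early; the node was still present
                }
            }
            if (!zk.exists(path)) {                 // (b) old node is gone: create ours
                zk.createEphemeral(path);
                return Outcome.CREATED;
            }
        }
        return Outcome.STILL_PRESENT;               // (c) caller has to investigate
    }

    public static void main(String[] args) {
        // Fake store (assumption): the stale node disappears after three checks.
        int[] checksLeft = {3};
        Znodes fake = new Znodes() {
            public boolean exists(String path) { return checksLeft[0]-- > 0; }
            public void createEphemeral(String path) { System.out.println("created " + path); }
        };
        // 20 polls at 1000 ms would approximate the 20 seconds suggested; shortened here.
        System.out.println(claim(fake, "/live_nodes/host:8983", 20, 10));
    }
}
```

Returning an outcome rather than throwing leaves the choice of "additional action" in step c) to the caller.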
>>>>
>>>>
>>>
>