hadoop-zookeeper-user mailing list archives

From kishore g <g.kish...@gmail.com>
Subject Re: ephemeral node after server bounce
Date Thu, 04 Feb 2010 23:00:29 GMT
The worst-case option would be to use JVM shutdown hooks:
http://stackoverflow.com/questions/40376/handle-signals-in-the-java-virtual-machine

You can delete the znodes on exit, much like the deleteOnExit functionality
of a File.
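
A minimal sketch of that idea, under stated assumptions: ZnodeCleanup and
Deleter are hypothetical names, and Deleter stands in for whatever actually
removes the node (with the ZooKeeper client that would typically be
zk.delete(path, -1)). Only the shutdown hook registration is standard JVM API.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical helper: remembers znode paths and deletes them when the JVM exits. */
class ZnodeCleanup {
    /** Stand-in for the real delete call, e.g. path -> zk.delete(path, -1). */
    interface Deleter {
        void delete(String path) throws Exception;
    }

    private final List<String> paths = new CopyOnWriteArrayList<>();
    private final Deleter deleter;

    ZnodeCleanup(Deleter deleter) {
        this.deleter = deleter;
    }

    /** Register a path to be removed at shutdown, in the spirit of File.deleteOnExit(). */
    void deleteOnExit(String path) {
        paths.add(path);
    }

    /** Delete every registered path; one failure does not stop the remaining deletes. */
    void runCleanup() {
        for (String path : paths) {
            try {
                deleter.delete(path);
            } catch (Exception e) {
                System.err.println("could not delete " + path + ": " + e);
            }
        }
    }

    /** Install the JVM shutdown hook and return it so a caller can remove it again. */
    Thread install() {
        Thread hook = new Thread(this::runCleanup, "znode-cleanup");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

Shutdown hooks run on a normal exit and on SIGTERM, but not on kill -9 or a
JVM crash, so the session timeout remains the fallback in those cases.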

thanks,
Kishore G



On Thu, Feb 4, 2010 at 2:56 PM, Patrick Hunt <phunt@apache.org> wrote:

> hah, you guys beat me to the punch. I think having some unique per client
> token might also work (see my resp). Perhaps this is the ip of the host or
> better (esp if multiple clients on a single host) would be some solr
> specific id that uniquely identifies each node.
>
> Patrick
>
>
> Benjamin Reed wrote:
>
>> i second ted's proposals! thanx ted.
>>
>> there is one other option. when you create the ZooKeeper object you can
>> pass a session id and password. your bounced server can actually reattach to
>> the session. (that is why we put that constructor in.) to use it you need to
>> save the session id and password to a persistent store (a file) when you
>> first attach, and then when you restart read the id and password from the
>> file.
>>
>> ben
>>
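
A rough sketch of the save/restore step Ben describes, assuming a simple
binary file layout. SavedSession is a hypothetical helper, not part of the
ZooKeeper client; the calls named in the comments (getSessionId,
getSessionPasswd, and the constructor taking a session id and password) are
the client's own.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical holder for the session credentials written to a local file. */
class SavedSession {
    final long id;
    final byte[] password;

    SavedSession(long id, byte[] password) {
        this.id = id;
        this.password = password.clone();
    }

    /**
     * Call once the client is connected, e.g.
     * new SavedSession(zk.getSessionId(), zk.getSessionPasswd()).save(file);
     */
    void save(Path file) throws IOException {
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(file))) {
            out.writeLong(id);
            out.writeInt(password.length);
            out.write(password);
        }
    }

    /**
     * Call on restart, then pass the values to
     * new ZooKeeper(connectString, sessionTimeout, watcher, s.id, s.password).
     */
    static SavedSession load(Path file) throws IOException {
        try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
            long id = in.readLong();
            byte[] password = new byte[in.readInt()];
            in.readFully(password);
            return new SavedSession(id, password);
        }
    }
}
```

Reattaching only helps if the bounced server comes back before the session
times out; once the session has expired the ephemeral node is gone and a new
session is needed anyway.
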
>> Ted Dunning wrote:
>>
>>> On Thu, Feb 4, 2010 at 2:20 PM, Yonik Seeley <yonik@lucidimagination.com> wrote:
>>>
>>>
>>>
>>>> There's no way to "hand over" responsibility for an ephemeral znode,
>>>> right?
>>>>
>>>>
>>>>
>>>
>>> Right.
>>>
>>>
>>>
>>>
>>>> We have solr nodes create ephemeral znodes (name based on host and port).
>>>> The ephemeral znode takes some time to remove of course, so what
>>>> happens is that if I bounce a solr server (containing a zk client) the
>>>> ephemeral node will still exist when the server comes back up.
>>>>
>>>>
>>>>
>>>
>>> This problem comes up with any system that has hysteresis and needs a
>>> single point of control.
>>>
>>>
>>>
>>>
>>>> What's the best way to handle this situation?  Delete and re-create?
>>>>
>>>>
>>>>
>>> Watch it and re-create when it does disappear?
>>> I think you need to handle the problem of multiple search nodes coming
>>> up on the same machine, possibly because the old one may have hung up.
>>>
>>> So... I would recommend
>>>
>>> a) if the ephemeral still exists, wait for a few more seconds to see if
>>> it disappears (20?)
>>>
>>> b) if it goes away, create a new one and continue as normal
>>>
>>> c) if it doesn't go away take additional action to determine if service
>>> is still running (i.e. panic and run in circles).
>>>
>>>
>>
>>
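
Ted's steps (a) to (c) could be sketched roughly as below. This is an
illustration under assumptions, not anyone's actual implementation:
EphemeralStartup and awaitThenCreate are hypothetical names, and the exists
and create callbacks stand in for wrappers around zk.exists(path, false) !=
null and zk.create(path, data, acl, CreateMode.EPHEMERAL) that handle the
checked exceptions those calls throw.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical startup helper: wait for a stale ephemeral node to vanish, then create ours. */
class EphemeralStartup {
    enum Outcome { CREATED, STILL_PRESENT }

    static Outcome awaitThenCreate(BooleanSupplier exists, Runnable create,
                                   long maxWaitMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + maxWaitMillis * 1_000_000L;
        // (a) while the old node is still there, keep polling until the deadline
        while (exists.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                // (c) it never went away; the caller decides what to do next
                return Outcome.STILL_PRESENT;
            }
            Thread.sleep(pollMillis);
        }
        // (b) the old node is gone, so create a fresh one.
        // Another process could still create it between the check and this call,
        // so a real create wrapper should expect a "node exists" failure.
        create.run();
        return Outcome.CREATED;
    }
}
```

With the 20 seconds suggested above, a call might look like
awaitThenCreate(exists, create, 20_000, 500). A watch on the node would avoid
the polling, at the cost of a little more code.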
