hadoop-zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: ephemeral node after server bounce
Date Thu, 04 Feb 2010 23:04:29 GMT
Ah, excellent idea, won't always work but may help. I think in this case 
(ephemerals) all Yonik would need to do is close the session. That will 
remove all ephemerals.

Patrick
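
A minimal sketch of the shutdown-hook idea combined with closing the session, assuming the ZooKeeper handle is passed in as an `AutoCloseable` (for example `zk::close`). `SessionCloser` and `closeOnExit` are hypothetical names used only for illustration; they are not part of ZooKeeper. The hook does not run if the JVM crashes or is killed with SIGKILL, which is one reason this "won't always work".

```java
final class SessionCloser {
    private SessionCloser() {}

    /**
     * Registers a JVM shutdown hook that closes the given session.
     * Closing a ZooKeeper session removes all of its ephemeral znodes.
     * Returns the hook thread so the caller can deregister it later.
     */
    static Thread closeOnExit(AutoCloseable session) {
        Thread hook = new Thread(() -> {
            try {
                session.close();
            } catch (Exception e) {
                // The JVM is already exiting; the session will simply
                // expire on the server after its timeout instead.
            }
        }, "zk-session-closer");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

Usage would look something like `SessionCloser.closeOnExit(zk::close);` right after the client connects.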

kishore g wrote:
> Worst case option would be to use JVM shutdown hooks:
> http://stackoverflow.com/questions/40376/handle-signals-in-the-java-virtual-machine
> 
> You can delete the znodes on exit, much like the deleteOnExit functionality
> of a File.
> 
> thanks,
> Kishore G
> 
> 
> 
> On Thu, Feb 4, 2010 at 2:56 PM, Patrick Hunt <phunt@apache.org> wrote:
> 
>> hah, you guys beat me to the punch. I think having some unique per-client
>> token might also work (see my response). Perhaps this is the IP of the host
>> or, better (esp. if there are multiple clients on a single host), some
>> solr-specific id that uniquely identifies each node.
>>
>> Patrick
>>
>>
>> Benjamin Reed wrote:
>>
>>> i second ted's proposals! thanx ted.
>>>
>>> there is one other option. when you create the ZooKeeper object you can
>>> pass a session id and password. your bounced server can actually reattach to
>>> the session. (that is why we put that constructor in.) to use it you need to
>>> save the session id and password to a persistent store (a file) when you
>>> first attach, and then when you restart read the id and password from the
>>> file.
>>>
>>> ben
>>>
>>> Ted Dunning wrote:
>>>
>>>> On Thu, Feb 4, 2010 at 2:20 PM, Yonik Seeley <yonik@lucidimagination.com> wrote:
>>>>
>>>>
>>>>> There's no way to "hand over" responsibility for an ephemeral znode,
>>>>> right?
>>>>>
>>>>>
>>>>>
>>>> Right.
>>>>
>>>>
>>>>
>>>>
>>>>> We have solr nodes create ephemeral znodes (name based on host and
>>>>> port). The ephemeral znode takes some time to be removed, of course,
>>>>> so if I bounce a solr server (containing a zk client), the ephemeral
>>>>> node will still exist when the server comes back up.
>>>>>
>>>>>
>>>>>
>>>> This problem comes up with any system that has hysteresis and needs a
>>>> single point of control.
>>>>
>>>>
>>>>
>>>>
>>>>> What's the best way to handle this situation?  Delete and re-create?
>>>>>
>>>>>
>>>>>
>>>> Watch it and re-create when it does disappear?
>>>> I think you need to handle the problem of multiple search nodes coming
>>>> up on the same machine, possibly because the old one has hung.
>>>>
>>>> So... I would recommend
>>>>
>>>> a) if the ephemeral still exists, wait a few more seconds (20?) to see
>>>> if it disappears
>>>>
>>>> b) if it goes away, create a new one and continue as normal
>>>>
>>>> c) if it doesn't go away, take additional action to determine whether the
>>>> service is still running (i.e. panic and run in circles).
>>>>
>>>>
>>>
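
Ted's steps (a) to (c) can be sketched roughly as a bounded polling loop. This is only an illustration under stated assumptions: `EphemeralWait` and `waitUntilGone` are hypothetical names, and the existence check is passed in as a `BooleanSupplier` so the sketch stays independent of the client library. With the ZooKeeper Java client, the supplier would typically wrap a call such as `zk.exists(path, false) != null`, with exception handling added.

```java
import java.util.function.BooleanSupplier;

final class EphemeralWait {
    private EphemeralWait() {}

    /**
     * Polls until the stale ephemeral node is gone or the timeout elapses.
     * Returns true if the node disappeared (step b: safe to create a new one),
     * false if it is still there (step c: investigate the old process).
     */
    static boolean waitUntilGone(BooleanSupplier exists,
                                 long timeoutMillis,
                                 long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (exists.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // still present after the grace period
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false; // treat interruption as "not confirmed gone"
            }
        }
        return true;
    }
}
```

A caller might use a timeout of about 20 seconds, as suggested above, and create the new ephemeral znode only when the method returns true. A watch on the old node would avoid polling, at the cost of a little more code.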
> 
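
Ben's reattach option depends on saving the session id and password somewhere durable. Below is a minimal sketch of that persistence step only, using a small binary file; `SessionStore` is a hypothetical helper, not part of ZooKeeper. In the Java client the values would come from `getSessionId()` and `getSessionPasswd()`, and would be passed back through the `ZooKeeper` constructor that accepts a session id and password. Reattaching can only succeed if the session has not yet expired on the server.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class SessionStore {
    final long sessionId;
    final byte[] passwd;

    SessionStore(long sessionId, byte[] passwd) {
        this.sessionId = sessionId;
        this.passwd = passwd.clone();
    }

    /** Writes the session id, then the password length and bytes. */
    void save(Path file) {
        try (DataOutputStream out =
                 new DataOutputStream(Files.newOutputStream(file))) {
            out.writeLong(sessionId);
            out.writeInt(passwd.length);
            out.write(passwd);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads back what save() wrote. */
    static SessionStore load(Path file) {
        try (DataInputStream in =
                 new DataInputStream(Files.newInputStream(file))) {
            long id = in.readLong();
            byte[] pw = new byte[in.readInt()];
            in.readFully(pw);
            return new SessionStore(id, pw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The file holds a credential for the session, so in practice it should be readable only by the service's own user.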
