zookeeper-user mailing list archives

From Cameron McKenzie <mckenzie....@gmail.com>
Subject Re: Ephemeral node bound to a session that times out while ZK has no quorum
Date Thu, 08 May 2014 10:41:54 GMT
After a few more trials, unfortunately it seems completely random how
long the ephemeral nodes stick around. Sometimes it's minutes,
sometimes they're cleaned up within seconds of startup...


On Thu, May 8, 2014 at 8:31 PM, Cameron McKenzie <mckenzie.cam@gmail.com> wrote:

> Sorry, bashed send prematurely!
>
> Guys,
> I've noticed a weird problem with ephemeral nodes not being cleaned up if
> the session they are tied to times out while ZooKeeper does not have a
> quorum. The situation is basically as follows:
>
> 3 node cluster
> -Client connects to cluster and creates an ephemeral node
> -Two nodes die, so quorum is lost
> -Some time passes (longer than the session timeout negotiated for the
> client that created the ephemeral node)
> -One (or both) of the dead nodes comes back and a quorum is re-formed.
> -The ephemeral node tied to the session which should have timed out still
> exists and never seems to get cleaned up.
> -If I telnet in on port 2181 and 'dump', then I can see that ZK seems to
> think that the session is still active and associated with the ephemeral
> node in question.
> -It seems to stay in this state for some extended period of time (20+
> minutes). Interestingly, when I happened to fire up zkCli.sh I could see
> that the node was still there, but after I exited, the node seemed to
> disappear shortly afterwards. So, I wonder if the session established by
> zkCli.sh ending somehow triggered the cleanup of this rogue ephemeral node?
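
For anyone wanting to repeat the 'dump' check above without telnet, here is a minimal Python sketch. It sends the four-letter command over a plain socket and lists which session owns which ephemeral node. The helper names, the host/port in the usage note, and the SAMPLE text are hypothetical, and the parser assumes the "Sessions with Ephemerals" layout that 'dump' prints, which may differ between ZooKeeper versions.

```python
import socket


def four_letter_word(host, port, cmd, timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_ephemerals(dump_text):
    """Return {session id: [ephemeral paths]} parsed from 'dump' output.

    Assumes each session appears as an unindented '0x...:' line followed
    by indented node paths, after a 'Sessions with Ephemerals' header.
    """
    owners = {}
    current = None
    in_section = False
    for line in dump_text.splitlines():
        if line.startswith("Sessions with Ephemerals"):
            in_section = True
            continue
        stripped = line.strip()
        if not in_section or not stripped:
            continue
        if line[0] in " \t":
            # Indented line: a path owned by the current session, if any.
            if current is not None:
                owners[current].append(stripped)
        elif stripped.startswith("0x") and stripped.endswith(":"):
            current = stripped[:-1]
            owners[current] = []
        else:
            # Some other unindented header: stop attributing paths.
            current = None
    return owners


# Hypothetical sample of 'dump' output, for illustration only.
SAMPLE = (
    "SessionTracker dump:\n"
    "Session Sets (1):\n"
    "1 expire at Thu May 08 10:45:00 GMT 2014:\n"
    "\t0x145d0f2a3b40000\n"
    "ephemeral nodes dump:\n"
    "Sessions with Ephemerals (1):\n"
    "0x145d0f2a3b40000:\n"
    "\t/test/ephemeral\n"
)
```

Against a live server this would be called as parse_ephemerals(four_letter_word("localhost", 2181, "dump")). Note that 'dump' is documented as only working on the leader, so it needs to be pointed at whichever node currently leads.
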
>
> Has anyone experienced this issue before? I understand that it's a bit of
> an edge case, but I'm running across it quite frequently when testing
> changes to the size of the ZK cluster.
>
> I've thought of a few workarounds for the issue, but I'd like to know if
> it's a known issue.
>
> Any help appreciated!
> cheers
>
>
>
> On Thu, May 8, 2014 at 8:15 PM, Cameron McKenzie <mckenzie.cam@gmail.com> wrote:
>
>> Guys,
>> I've noticed a weird problem with ephemeral nodes not being cleaned up if
>> the session they are tied to times out while ZooKeeper does not have a
>> quorum. The situation is basically as follows:
>>
>> 3 node cluster
>> -Client connects to cluster and creates an ephemeral node
>> -Two nodes die, so quorum is lost
>> -Some time passes (longer than the session timeout negotiated for the
>> client that created the ephemeral node)
>> -One (or both) of the dead nodes comes back and a quorum is re-formed.
>> -The ephemeral node tied to the session which should have timed out still
>> exists
>>
>>
>
