zookeeper-user mailing list archives

From Cameron McKenzie <mckenzie....@gmail.com>
Subject Re: Ephemeral node bound to a session that times out while ZK has no quorum
Date Sat, 17 May 2014 02:22:59 GMT
Thanks Flavio,
This probably explains the situation, but I will have to check the logs
again to be sure. It seemed like the ephemeral node didn't get cleaned up
for an extended period of time even though the client had established a new
connection. It could be some weirdness where the old session was still
alive because it hadn't been closed down properly, but this seems
unlikely.

Anyway, thanks for the link.
cheers


On Fri, May 16, 2014 at 8:13 PM, FPJ <fpjunqueira@yahoo.com> wrote:

> Hi Cameron,
>
> The last point of the FAQ might clarify why the ephemerals are not getting
> deleted when the cluster is coming back up:
>
> https://cwiki.apache.org/confluence/display/ZOOKEEPER/FAQ
>
> -Flavio
>
> > -----Original Message-----
> > From: Cameron McKenzie [mailto:mckenzie.cam@gmail.com]
> > Sent: 08 May 2014 11:42
> > To: zookeeper-user@hadoop.apache.org
> > Subject: Re: Ephemeral node bound to a session that times out while ZK
> > has no quorum
> >
> > After a few more trials, unfortunately it seems completely random as to
> > how long the ephemeral nodes are sticking around. Sometimes it's minutes,
> > sometimes they're cleaned up in a matter of seconds after startup...
> >
> >
> > On Thu, May 8, 2014 at 8:31 PM, Cameron McKenzie
> > <mckenzie.cam@gmail.com> wrote:
> >
> > > Sorry, bashed send prematurely!
> > >
> > > Guys,
> > > I've noticed a weird problem with ephemeral nodes not being cleaned up
> > > if the session they are tied to times out while ZooKeeper does not
> > > have a quorum. The situation is basically as follows:
> > >
> > > 3 node cluster
> > > -Client connects to cluster and creates an ephemeral node
> > > -Two nodes die, so quorum is lost
> > > -Some time passes (longer than the session timeout negotiated for the
> > > client that created the ephemeral node)
> > > -One (or both) of the dead nodes come back and a quorum is reformed.
> > > -The ephemeral node tied to the session which should have timed out
> > > still exists and never seems to get cleaned up.
> > > -If I telnet in on port 2181 and 'dump', then I can see that ZK seems
> > > to think that the session is still active and associated with the
> > > ephemeral node in question.
> > > -It seems to stay in this state for some extended period of time (20+
> > > minutes). Interestingly, when I happened to fire up zkCli.sh I could
> > > see that the node was still there, but after I exited, the node seemed
> > > to disappear shortly afterwards. So, I wonder if the session
> > > established by zkCli.sh ending somehow triggered the cleanup of this
> > > rogue ephemeral node?
> > >
> > > Has anyone experienced this issue before? I understand that it's a bit
> > > of an edge case, but I'm running across it quite frequently when
> > > testing changes to the size of a ZK cluster.
> > >
> > > I've thought of a few workarounds for the issue, but I'd like to know
> > > if it's a known issue.
> > >
> > > Any help appreciated!
> > > cheers
> > >
> > >
> > >
> > > On Thu, May 8, 2014 at 8:15 PM, Cameron McKenzie
> > > <mckenzie.cam@gmail.com> wrote:
> > >
> > >> Guys,
> > >> I've noticed a weird problem with ephemeral nodes not being cleaned
> > >> up if the session they are tied to times out while ZooKeeper does not
> > >> have a quorum. The situation is basically as follows:
> > >>
> > >> 3 node cluster
> > >> -Client connects to cluster and creates an ephemeral node
> > >> -Two nodes die, so quorum is lost
> > >> -Some time passes (longer than the session timeout negotiated for the
> > >> client that created the ephemeral node)
> > >> -One (or both) of the dead nodes come back and a quorum is reformed.
> > >> -The ephemeral node tied to the session which should have timed out
> > >> still exists
> > >>
> > >>
> > >
>
>
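
The 'dump' check described in the quoted message (telnet to port 2181, then
type 'dump') could be scripted roughly as follows. This is only a sketch under
some assumptions: the function names are made up for illustration, and the
parser assumes the reply contains a "Sessions with Ephemerals" section in
which each session id line ("0x...:") is followed by the paths it owns. That
layout may differ between ZooKeeper versions, so check it against real output
from your own servers. Also note that 'dump' is answered by the leader, and
newer releases may require four-letter commands to be explicitly enabled.

```python
import socket
from typing import Dict, List


def four_letter_word(host: str, port: int, cmd: str, timeout: float = 5.0) -> str:
    """Send a ZooKeeper four-letter command (e.g. 'dump') and return the reply.

    The server closes the connection after replying, so we read until EOF.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def ephemerals_by_session(dump_text: str) -> Dict[str, List[str]]:
    """Map session id -> ephemeral paths, based on the ASSUMED dump layout."""
    result: Dict[str, List[str]] = {}
    current = None
    in_section = False
    for line in dump_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Sessions with Ephemerals"):
            in_section = True  # everything before this header is ignored
            continue
        if not in_section or not stripped:
            continue
        if stripped.startswith("0x") and stripped.endswith(":"):
            current = stripped[:-1]  # session id line, e.g. "0x145d3a0001:"
            result[current] = []
        elif current is not None and stripped.startswith("/"):
            result[current].append(stripped)  # path owned by that session
    return result


# Example use against a live server (host and port are placeholders):
#   owners = ephemerals_by_session(four_letter_word("localhost", 2181, "dump"))
#   for session_id, paths in owners.items():
#       print(session_id, paths)
```

Running this periodically after the quorum re-forms would show how long the
old session id stays listed as the owner of the ephemeral node, which is the
interval the thread is trying to pin down.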
