zookeeper-user mailing list archives

From Martin Serrano <mar...@attivio.com>
Subject RE: disconnects and auto renewal
Date Tue, 13 Sep 2011 11:58:09 GMT

Sorry to trouble you on this one.  I do understand the difference, but at some point I did
not.  :)

Your question inspired me to look deeper at our code (to see if we were confused) and I found
one case where a Disconnected event was triggering our reconnect logic.  Everywhere else
we only do this in response to a SessionExpiredException.
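
A minimal sketch of that distinction, for illustration only.  The nested
SessionState enum is a hypothetical stand-in for the states the ZooKeeper
client reports (Disconnected, SyncConnected, Expired), so that the example
compiles without the ZooKeeper jar; the Reaction names are also made up here.
The point it tries to show is that a Disconnected event means "wait, the
client library reconnects on its own", and only an expired session calls for
building a new one.

```java
final class SessionEventPolicy {
    // Hypothetical stand-in for the client's connection states; only the
    // three states discussed in this thread are modeled.
    enum SessionState { DISCONNECTED, SYNC_CONNECTED, EXPIRED }

    // Hypothetical names for what the application does in response.
    enum Reaction { WAIT_FOR_RECONNECT, RESUME, CREATE_NEW_SESSION }

    static Reaction reactTo(SessionState state) {
        switch (state) {
            case DISCONNECTED:
                // The session may still be alive on the server, so do not
                // open a new session here.
                return Reaction.WAIT_FOR_RECONNECT;
            case SYNC_CONNECTED:
                // Same session as before; ephemerals are still in place.
                return Reaction.RESUME;
            case EXPIRED:
                // The server has discarded the session and its ephemerals.
                return Reaction.CREATE_NEW_SESSION;
            default:
                throw new IllegalArgumentException("unhandled state: " + state);
        }
    }

    public static void main(String[] args) {
        for (SessionState s : SessionState.values()) {
            System.out.println(s + " -> " + reactTo(s));
        }
    }
}
```

In a real client this decision would sit inside the Watcher callback, keyed on
the event's state rather than on the stand-in enum above.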

Thanks for the quick response and your work on ZooKeeper in general!  I have also run into
the "can't create ephemeral yet" case, and our code generally loops until it succeeds.
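
A rough sketch of such a loop, under stated assumptions: the tryCreate
callback is hypothetical and stands for one attempt to create the ephemeral
node, returning false while the stale node from the old session still exists
(with the real client, that attempt would be a ZooKeeper.create call with
CreateMode.EPHEMERAL that treats NodeExistsException as "not yet").  The
attempt cap and pause are illustrative values, not anything from our code.
The cap matters because, if the old session is in fact still valid, the stale
node never goes away and an unbounded loop would spin forever.

```java
import java.util.function.BooleanSupplier;

final class EphemeralRetry {
    // Calls tryCreate until it reports success or maxAttempts is reached.
    // Returns the 1-based attempt number that succeeded, or -1 if none did
    // (or if the thread was interrupted while pausing).
    static int createWithRetry(BooleanSupplier tryCreate, int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (tryCreate.getAsBoolean()) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                try {
                    // Give the server time to expire the old session.
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Simulated server: the stale node blocks the first two attempts.
        int[] blockedAttemptsLeft = {2};
        int attempts = createWithRetry(() -> blockedAttemptsLeft[0]-- <= 0, 5, 10L);
        System.out.println("created after " + attempts + " attempt(s)");
    }
}
```

A return value of -1 is the signal to stop retrying and check whether the
previous session really expired.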


> -----Original Message-----
> From: Ted Dunning [mailto:ted.dunning@gmail.com]
>
> Martin,
>
> From your email, it sounds like there might be a bit of confusion between
> disconnection and session expiration.  Are you sure you are clear on the
> difference between these?
>
> Also, I have seen cases in my own code where I confused myself by trying to
> re-create ephemeral files after a client program crashed.  I knew that the
> client had crashed as soon as it happened, but the ZooKeeper servers could
> only determine this after a bit of time.  My new program tried to recreate the
> ephemerals to indicate that it was back but since the old ephemerals were
> still there, that failed.  Then a short time later when the ZK cluster
> understood that the old client was gone, the ephemerals disappeared even
> though the new client was humming along nicely.  My solution was to delete
> the ephemerals when creating them.
>
> Is it possible you have a similar confusion?
>
> On Tue, Sep 13, 2011 at 11:25 AM, Martin Serrano <martin@attivio.com>
> wrote:
> > Hi,
> >
> > We have added code to our application to reconnect and re-establish
> > watches when we receive a Disconnected event.  I am running tests on a
> > heavily loaded system where the zookeeper server and clients are all
> > impacted.  On this test system we regularly experience session
> > timeouts and appropriately react to reconnect and set up our watches.
> > There is an uncommon case that I am having trouble puzzling out.  When
> > running one of our tests in a loop about 1% of the time we hit a case
> > where on the client side we think the session has expired but on the
> > server side it has been renewed.  We will then fail to be able to create
> > an ephemeral node because it already exists and does not ever get
> > cleaned up (since the previous session is still valid).  I'm trying to
> > figure out if we are misusing the API or if we have encountered a bug.
> > I'm happy to provide more details.  One thing I am wondering is if it is
> > inappropriate to create a new session within the event thread of another
> > session which has received the disconnected event.
> >
> > Thanks,
> > Martin Serrano
> > ...
