zookeeper-user mailing list archives

From "Carroll James (Nokia-LC/Malvern)" <james.carr...@nokia.com>
Subject Unrecoverable ConnectionLossException after server restart
Date Wed, 28 Nov 2012 06:18:59 GMT
I'm seeing what I think is incorrect behavior from ZooKeeper.

When I start a client, connect to a server, and then restart the server, I expected the client
to reconnect eventually. It doesn't. Instead, every call throws a ConnectionLossException,
the ZooKeeper client's isAlive is still true, and I never get a SESSION_EXPIRATION. The
client-side ephemeral ports listed in the error message keep counting up, as if the client
were continually attempting to reconnect.

If I recreate the ZooKeeper client, the new client connects and I can use it.

So I could simply react as if I had received a SESSION_EXPIRATION exception and rebuild the
client state, except that I ALSO get a ConnectionLossException during a network partition.
When I periodically recreate the entire client from scratch in response to a ConnectionLossException,
I eventually run out of file descriptors and my entire process is hosed. This seems to be
related to the use of NIO and the repeated opening of pipes and anon_inodes (which show up
in lsof output).
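
For illustration, here is a minimal sketch of the recreate-on-failure pattern described above, written so that the old handle is always closed before a new one is built. It uses only the Java standard library: `Handle`, `HandleHolder`, and `CountingHandle` are hypothetical names standing in for the real client, not part of the ZooKeeper API. The real `org.apache.zookeeper.ZooKeeper` class does have a `close()` method, but whether a missing `close()` is what causes the descriptor growth reported here is an assumption, not something the message confirms.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical stand-in for a client handle; only close() matters here. */
interface Handle extends AutoCloseable {
    @Override
    void close();
}

/** Holds at most one live handle and closes the old one before replacing it. */
final class HandleHolder {
    private final Supplier<Handle> factory;
    private Handle current;

    HandleHolder(Supplier<Handle> factory) {
        this.factory = factory;
        this.current = factory.get();
    }

    synchronized Handle get() {
        return current;
    }

    /** Releases the old handle's resources first, then builds a fresh one. */
    synchronized Handle recreate() {
        Handle old = current;
        current = null;
        if (old != null) {
            old.close();
        }
        current = factory.get();
        return current;
    }
}

/** Fake handle that counts how many instances are currently open. */
final class CountingHandle implements Handle {
    static final AtomicInteger OPEN = new AtomicInteger();
    private boolean closed;

    CountingHandle() {
        OPEN.incrementAndGet();
    }

    @Override
    public synchronized void close() {
        // Idempotent: a second close() must not decrement the counter again.
        if (!closed) {
            closed = true;
            OPEN.decrementAndGet();
        }
    }
}
```

With a holder built as `new HandleHolder(CountingHandle::new)`, calling `recreate()` any number of times should leave exactly one open handle. If the `close()` call in `recreate()` is dropped, the open count grows by one per call, which is the kind of growth that eventually exhausts file descriptors.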

Am I doing something wrong? Any suggestions?
