zookeeper-user mailing list archives

From Cameron McKenzie <cammcken...@apache.org>
Subject Reconnection with expired session
Date Wed, 12 Nov 2014 02:08:57 GMT
Guys,
I have a (possibly somewhat contrived) issue relating to reconnection of a
client to ZK after quorum has been lost, and data has been corrupted.

Essentially this is what's happening:
-Client connects to 3 node ZK cluster
-Client writes some ephemeral znodes etc.
-All nodes in ZK cluster are shut down
-Contents of data/version-2 directories are removed on each ZK instance
(i.e. the acceptedEpoch, currentEpoch and all the snapshots and transaction logs)
-Restart the nodes in the ZK cluster

At this point, the ZK cluster comes up fine, but the client will not
automatically reconnect. Having stepped through the client code with a
debugger, it seems that the server simply doesn't respond to the session
initialisation request. These are the logs, which are repeated every
second. Note that if I restart the client, everything's fine.

12:56:35.978 [main-SendThread(ubuntubox:2181)] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
ubuntubox/192.168.56.102:2181. Will not attempt to authenticate using SASL
(unknown error)
12:56:35.980 [main-SendThread(ubuntubox:2181)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
ubuntubox/192.168.56.102:2181, initiating session
12:56:35.983 [main-SendThread(ubuntubox:2181)] DEBUG
org.apache.zookeeper.ClientCnxn - Session establishment request sent on
ubuntubox/192.168.56.102:2181
12:56:36.002 [main-SendThread(ubuntubox:2181)] INFO
org.apache.zookeeper.ClientCnxn - Unable to read additional data from
server sessionid 0x249a1b64cc90000, likely server has closed socket,
closing socket connection and attempting reconnect
12:56:37.833 [main-SendThread(ubuntubox:2182)] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
ubuntubox/192.168.56.102:2182. Will not attempt to authenticate using SASL
(unknown error)
12:56:37.834 [main-SendThread(ubuntubox:2182)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
ubuntubox/192.168.56.102:2182, initiating session
12:56:37.835 [main-SendThread(ubuntubox:2182)] DEBUG
org.apache.zookeeper.ClientCnxn - Session establishment request sent on
ubuntubox/192.168.56.102:2182
12:56:37.859 [main-SendThread(ubuntubox:2182)] INFO
org.apache.zookeeper.ClientCnxn - Unable to read additional data from
server sessionid 0x249a1b64cc90000, likely server has closed socket,
closing socket connection and attempting reconnect
12:56:38.298 [main-SendThread(ubuntubox:2183)] INFO
org.apache.zookeeper.ClientCnxn - Opening socket connection to server
ubuntubox/192.168.56.102:2183. Will not attempt to authenticate using SASL
(unknown error)
12:56:38.299 [main-SendThread(ubuntubox:2183)] INFO
org.apache.zookeeper.ClientCnxn - Socket connection established to
ubuntubox/192.168.56.102:2183, initiating session
12:56:38.300 [main-SendThread(ubuntubox:2183)] DEBUG
org.apache.zookeeper.ClientCnxn - Session establishment request sent on
ubuntubox/192.168.56.102:2183

Can someone explain what's going on? Is this a bug? While I understand that
it's slightly contrived, the destruction of the data is certainly a
possibility, and having to restart every client even when the cluster comes
back up is not ideal.
cheers
Cam
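
One possible client-side workaround for the behaviour described above, sketched under stated assumptions: since restarting the client fixes things, the application could discard its ZooKeeper handle and create a new one (and with it a new session) once it has been disconnected for longer than some limit. The class below is hypothetical and not part of the ZooKeeper API. It only does the timing, the clock and the rebuild action are injected by the caller, and the limit value is an assumption to tune (the session timeout is a natural starting point). The intent is that the application's Watcher calls onDisconnected() on KeeperState.Disconnected and onConnected() on KeeperState.SyncConnected, while a scheduled task calls check().

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not a ZooKeeper class): tracks how long the client has
 * been disconnected and asks the application to rebuild its client handle
 * once the outage exceeds a configured limit.
 */
final class DisconnectWatchdog {
    private final long limitMs;           // assumed threshold, e.g. the session timeout
    private final LongSupplier clock;     // returns the current time in milliseconds
    private final Runnable rebuildClient; // closes the old handle and creates a new one
    private long disconnectedSince = -1;  // -1 means "currently connected"

    DisconnectWatchdog(long limitMs, LongSupplier clock, Runnable rebuildClient) {
        this.limitMs = limitMs;
        this.clock = clock;
        this.rebuildClient = rebuildClient;
    }

    /** Call when the watcher reports a Disconnected state. */
    synchronized void onDisconnected() {
        if (disconnectedSince < 0) {
            disconnectedSince = clock.getAsLong();
        }
    }

    /** Call when the watcher reports a SyncConnected state. */
    synchronized void onConnected() {
        disconnectedSince = -1;
    }

    /**
     * Call periodically. Returns true if a rebuild was triggered. After a
     * rebuild the timer restarts, so a handle that still cannot connect is
     * rebuilt again after another full limit has elapsed.
     */
    synchronized boolean check() {
        if (disconnectedSince < 0) {
            return false;
        }
        long now = clock.getAsLong();
        if (now - disconnectedSince < limitMs) {
            return false;
        }
        disconnectedSince = now;
        rebuildClient.run();
        return true;
    }
}
```

A new handle means a new session, so any ephemeral znodes and watches owned by the old session would need to be recreated by the application afterwards. This does not explain why the server drops the connection; it only automates the manual client restart.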
