zookeeper-user mailing list archives

From Anthony Shaya <ash...@workforcesoftware.com>
Subject Zookeeper session expiration
Date Mon, 04 Dec 2017 15:22:59 GMT

I have a question about ZooKeeper sessions (within a cluster). On our production servers,
some of our clients lose their connection to ZooKeeper, after which our client application
goes down (we bring it back automatically and it reconnects to ZK just fine). From the
investigation done so far, this appears to be a session expiration failure.

2017-09-14 09:26:11,516 org.apache.zookeeper.ClientCnxn INFO [main-SendThread(***:2181)] Opening
socket connection to server *** /***:2181. Will not attempt to authenticate using SASL (unknown
2017-09-14 09:26:11,517 org.apache.zookeeper.ClientCnxn INFO [main-SendThread(***:2181)] Socket
connection established to *** /***:2181, initiating session
2017-09-14 09:26:11,519 org.apache.zookeeper.ClientCnxn DEBUG [main-SendThread(***:2181)]
Session establishment request sent on *** /***:2181
2017-09-14 09:26:11,520 org.apache.zookeeper.ClientCnxn INFO [main-SendThread(***:2181)] Unable
to reconnect to ZooKeeper service, session 0x3256a2e5b6090079 has expired, closing socket

My question is about how session expiration works. I noticed that the clocks on many of the
client machines were off (by anywhere from 1 minute to 20 minutes; this was resolved after
discovery, though I haven't verified that completely yet). Can this directly affect
session expiration within the ZooKeeper cluster?

  *   I read the following in https://wiki.apache.org/hadoop/ZooKeeper/FAQ , "Expirations
happens when the cluster does not hear from the client within the specified session timeout
period (i.e. no heartbeat).". So it seems that if the times were wrong across the machines,
it's possible that one of the clients could have effectively sent a heartbeat in the past
(not sure about this, to be honest), after which the cluster expires the session?

  *   I don't have the zookeeper node log for the above time to see what was going on in zookeeper
when the cluster determined the session expired.

  *   Is there any additional logging I can turn on to troubleshoot ZK session expiration?
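
To make the heartbeat question above concrete, here is a minimal sketch of the model the FAQ
quote describes. It is a toy, not ZooKeeper's actual code: the SessionTracker class, its
method names, the client_wall_clock parameter and the session id "0x1" are all hypothetical.
It assumes the server measures silence on its own clock and that a heartbeat carries no
client timestamp; whether that assumption holds for the real server is exactly what I am
asking.

```python
class SessionTracker:
    """Toy server-side session tracker (hypothetical, not ZooKeeper's implementation)."""

    def __init__(self):
        self.now = 0.0        # the server's own elapsed time, in seconds
        self.timeouts = {}    # session id -> negotiated session timeout
        self.deadlines = {}   # session id -> expiry deadline on the server clock

    def create(self, session_id, timeout):
        self.timeouts[session_id] = timeout
        self.deadlines[session_id] = self.now + timeout

    def touch(self, session_id, client_wall_clock=None):
        """Record a heartbeat. The client's wall clock is deliberately ignored
        (assumption: only the server's clock matters)."""
        if session_id not in self.deadlines:
            return False      # already expired; the client only learns this now
        self.deadlines[session_id] = self.now + self.timeouts[session_id]
        return True

    def advance(self, seconds):
        """Move the server clock forward and return the sessions that expired."""
        self.now += seconds
        expired = [sid for sid, d in self.deadlines.items() if d <= self.now]
        for sid in expired:
            del self.deadlines[sid]
        return expired


tracker = SessionTracker()
tracker.create("0x1", timeout=30)

# A client whose clock is 20 minutes slow still heartbeats every 10 seconds.
for _ in range(6):
    tracker.advance(10)
    assert tracker.touch("0x1", client_wall_clock=tracker.now - 1200)

# The same client then goes silent for longer than the session timeout.
print(tracker.advance(31))    # sessions the server expired while the client was silent
print(tracker.touch("0x1"))   # result of the reconnect attempt after expiry
```

Under this model a skewed client clock alone would not cause an expiration; only a gap in
heartbeats longer than the timeout would, and the client would find out on reconnect, which
would match the "Unable to reconnect ... has expired" line arriving 4 ms after the session
request in the log above.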


