zookeeper-user mailing list archives

From Jordan Zimmerman <jor...@jordanzimmerman.com>
Subject Re: ZK session expiration and recovery
Date Fri, 18 Jul 2014 15:07:55 GMT
You might consider using Curator (http://curator.apache.org). One of its main features is
ZooKeeper connection management.
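
For context on what such connection management has to do: once a session has expired, the existing ZooKeeper handle cannot be reused. The application has to create a new handle, and any watches or ephemeral nodes tied to the old session must be set up again. Below is a minimal, library-independent sketch of that recovery loop in Java. `RecoveringClient`, `Handle`, and `SessionExpiredException` are hypothetical names made up for illustration; they are not ZooKeeper or Curator APIs, and a real client would also need backoff between attempts.

```java
import java.util.List;
import java.util.function.Supplier;

/** Sketch only: replace a client handle with a fresh one when its session expires. */
final class RecoveringClient {

    /** Hypothetical stand-in for a session-expired error (unchecked for brevity). */
    static final class SessionExpiredException extends RuntimeException {}

    /** Hypothetical stand-in for one client handle, i.e. one session. */
    interface Handle {
        List<String> getChildren(String path);
        void close();
    }

    private final Supplier<Handle> factory; // builds a handle with a brand-new session
    private final int maxAttempts;
    private Handle handle;
    private int sessionsCreated = 0;

    RecoveringClient(Supplier<Handle> factory, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.factory = factory;
        this.maxAttempts = maxAttempts;
        this.handle = newHandle();
    }

    private Handle newHandle() {
        sessionsCreated++;
        return factory.get();
    }

    /** Number of handles built so far, including the initial one. */
    int sessionsCreated() {
        return sessionsCreated;
    }

    /** Runs the read; on session expiry, discards the dead handle and retries on a new one. */
    synchronized List<String> getChildren(String path) {
        SessionExpiredException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return handle.getChildren(path);
            } catch (SessionExpiredException e) {
                last = e;
                handle.close();       // the expired handle is unusable
                handle = newHandle(); // watches would have to be re-registered here
            }
        }
        throw last; // every attempt hit an expired session
    }
}
```

A caller would pass a factory that builds the real client, for example `new RecoveringClient(factory, 3)`, and route reads through `getChildren` so that an expired session triggers a rebuild instead of leaving the client dead.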


On July 18, 2014 at 9:59:56 AM, Ahmed H. (ahmed.hammad@gmail.com) wrote:


I am having issues with ZooKeeper connection loss. This affects various
things in my application, namely watchers, which fail with errors like the
one below:

23:13:01,593 ERROR [org.apache.zookeeper.ClientCnxn]  
(pool-5-thread-1-EventThread) Error while calling watcher :  
KeeperErrorCode = Session expired for /controller/resync  
at org.apache.zookeeper.KeeperException.create(KeeperException.java:118)  
at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)  
at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1249)  
at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) [:1.7.0_51]  
at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_51]  
at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)  
at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28)  
at zookeeper$children.doInvoke(zookeeper.clj:230)
at clojure.lang.RestFn.invoke(RestFn.java:464) [clojure-1.5.1.jar:]
at resync$resync_group_watcher.invoke(resync.clj:26)  
at zookeeper.internal$make_watcher$reify__10446.process(internal.clj:56)  
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:507)  

I guess I have a few questions that might help me mitigate this issue. I  
could try to fix whatever is causing the session expiration. This issue  
occurs when we have a lot of activity on the machine, which leads me to  
believe that it might be caused by GC activity (based on the ZK guide).  
This might work, but it seems to me like we would just be masking the issue  
and eventually, it might happen again.  

The other issue is that our client never recovers. It's completely dead. Is  
there a way to make it auto reconnect after it dies? Does Zookeeper support  
such functionality?  

Are there any other things I should be aware of or any recommendations you  
have for setting up a Zookeeper environment? For the record, we are running  
version 3.4.5 in a single node setup.  

