curator-user mailing list archives

From Osman <osmans...@gmail.com>
Subject Re: curator-2.4.0 cannot recover connection loss
Date Thu, 10 Apr 2014 16:11:23 GMT
Hi Jae,

Just letting you know that, using ZooKeeper 3.4.6 and Curator 2.4.1, I
could not reproduce your case in my environment.
It would be nice to see this problem in my environment. How can I trigger
it?

After starting the application (which uses a PathChildrenCacheListener), I
stop ZooKeeper and restart it 40 seconds later.
The application goes to the SUSPENDED state, reports ConnectionLoss, and
then switches to the RECONNECTED state.
(After 30 minutes of checking the logs, it had not gone back to the
SUSPENDED state; it was still connected and listening for child node
changes.)

java.io.IOException: An existing connection was forcibly closed by the
remote host
08:40:34.464 [main-EventThread] INFO  o.a.c.f.state.ConnectionStateManager
- State change: SUSPENDED
08:40:34.473 [PathChildrenCache-0] ERROR o.a.c.f.r.cache.PathChildrenCache
-
08:40:40.198 [CuratorFramework-0] WARN  org.apache.curator.ConnectionState
- Connection attempt unsuccessful after 2000 (greater than max timeout of
500). Resetting connection and trying again with a new connection.
08:40:40.198 [CuratorFramework-0] DEBUG org.apache.zookeeper.ZooKeeper -
Closing session: 0x0
08:40:40.198 [CuratorFramework-0] DEBUG org.apache.zookeeper.ClientCnxn -
Closing client for session: 0x0
08:40:42.344 [CuratorFramework-0] WARN  org.apache.curator.ConnectionState
- Connection attempt unsuccessful after 2146 (greater than max timeout of
500). Resetting connection and trying again with a new connection.
08:40:42.344 [CuratorFramework-0] DEBUG org.apache.curator.ConnectionState
- reset
08:40:42.344 [CuratorFramework-0] DEBUG org.apache.zookeeper.ZooKeeper -
Closing session: 0x0
08:40:42.344 [CuratorFramework-0] DEBUG org.apache.zookeeper.ClientCnxn -
Closing client for session: 0x0
08:40:42.403 [CuratorFramework-0-SendThread(127.0.0.1:2181)] INFO
 org.apache.zookeeper.ClientCnxn - Opening socket connection to server
127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL
(unknown error)
08:40:42.409 [CuratorFramework-0] ERROR o.a.c.f.imps.CuratorFrameworkImpl -
Background operation retry gave up
org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss
08:40:42.410 [CuratorFramework-0] ERROR o.a.c.f.imps.CuratorFrameworkImpl -
Background retry gave up
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
ConnectionLoss
08:40:46.920 [CuratorFramework-0] WARN  org.apache.curator.ConnectionState
- Connection attempt unsuccessful after 1389 (greater than max timeout of
500). Resetting connection and trying again with a new connection.
08:40:46.920 [CuratorFramework-0] DEBUG org.apache.curator.ConnectionState
- reset
08:40:46.920 [CuratorFramework-0] DEBUG org.apache.zookeeper.ZooKeeper -
Closing session: 0x0
08:40:46.920 [CuratorFramework-0] DEBUG org.apache.zookeeper.ClientCnxn -
Closing client for session: 0x0
08:41:14.303 [CuratorFramework-0-SendThread(0:0:0:0:0:0:0:1:2181)] DEBUG
o.a.zookeeper.ClientCnxnSocketNIO - Ignoring exception during shutdown input
java.net.SocketException: Socket is not connected

Then, after the ZooKeeper instance is started again, the PathChildrenCache
continues to receive updates:
08:41:15.804 [CuratorFramework-0-EventThread] INFO
 o.a.c.f.state.ConnectionStateManager - State change: RECONNECTED
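
To repeat the log check described above without reading 30 minutes of
output by hand, something like the following may help. It is only a minimal
sketch in plain Java with no Curator dependency. It assumes each log entry
sits on a single line in the real log file (unlike the wrapped lines in this
mail), and the class name StateChangeScanner and its two methods are
hypothetical helpers of my own, not part of Curator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Curator): scans log lines for the
// "State change: X" messages written by ConnectionStateManager.
public class StateChangeScanner {

    // Matches the message text seen in the logs, e.g. "State change: SUSPENDED".
    private static final Pattern STATE_CHANGE =
            Pattern.compile("State change: (\\w+)");

    /** Returns the connection states in the order they were logged. */
    public static List<String> extractStates(List<String> logLines) {
        List<String> states = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = STATE_CHANGE.matcher(line);
            if (m.find()) {
                states.add(m.group(1));
            }
        }
        return states;
    }

    /** True if the last logged state is CONNECTED or RECONNECTED. */
    public static boolean endsConnected(List<String> states) {
        if (states.isEmpty()) {
            return false;
        }
        String last = states.get(states.size() - 1);
        return last.equals("CONNECTED") || last.equals("RECONNECTED");
    }

    public static void main(String[] args) {
        // Two entries taken from the logs in this mail, unwrapped to one line each.
        List<String> lines = Arrays.asList(
            "08:40:34.464 [main-EventThread] INFO  o.a.c.f.state.ConnectionStateManager - State change: SUSPENDED",
            "08:41:15.804 [CuratorFramework-0-EventThread] INFO  o.a.c.f.state.ConnectionStateManager - State change: RECONNECTED");

        List<String> states = extractStates(lines);
        System.out.println(states);                // prints [SUSPENDED, RECONNECTED]
        System.out.println(endsConnected(states)); // prints true
    }
}
```

Run against a full log file, a final state of SUSPENDED or LOST would
suggest the client never recovered, which is the behaviour reported in the
quoted message below.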

Regards.

On 9 April 2014 18:55, Bae, Jae Hyeon <metacret@gmail.com> wrote:

> Last night, I did a rolling restart of ZooKeeper 3.4.5 to update its
> configuration, and I saw that curator-2.4.0 couldn't recover from the
> connection loss.
>
> ERROR 2014-04-09 17:48:15,231 [DaemonThreadFactory-2-thread-2]
> org.apache.curator.framework.imps.CuratorFrameworkImpl: Background retry
> gave up
> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
> ConnectionLoss
>         at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:766)
>         at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:749)
>         at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:56)
>         at
> org.apache.curator.framework.imps.CuratorFrameworkImpl$3.call(CuratorFrameworkImpl.java:244)
>         at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:724)
>
> INFO  2014-04-09 17:48:15,276 [ServerInventoryView-0-EventThread]
> org.apache.curator.framework.state.ConnectionStateManager: State change:
> RECONNECTED
> INFO  2014-04-09 17:48:15,382 [ServerInventoryView-0-EventThread]
> org.apache.curator.framework.state.ConnectionStateManager: State change:
> SUSPENDED
> ERROR 2014-04-09 17:48:15,748 [DaemonThreadFactory-2-thread-2]
> org.apache.curator.framework.imps.CuratorFrameworkImpl: Background
> exception was not retry-able or retry gave up
> java.lang.NullPointerException
>         at
> com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
>         at
> com.google.common.collect.Lists$TransformingSequentialList.<init>(Lists.java:527)
>         at com.google.common.collect.Lists.transform(Lists.java:510)
>         at
> org.apache.curator.framework.recipes.cache.PathChildrenCache.processChildren(PathChildrenCache.java:635)
>         at
> org.apache.curator.framework.recipes.cache.PathChildrenCache.access$200(PathChildrenCache.java:68)
>         at
> org.apache.curator.framework.recipes.cache.PathChildrenCache$4.processResult(PathChildrenCache.java:476)
>         at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.sendToBackgroundCallback(CuratorFrameworkImpl.java:686)
>         at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:659)
>         at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:783)
>         at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:749)
>         at
> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:56)
>         at
> org.apache.curator.framework.imps.CuratorFrameworkImpl$3.call(CuratorFrameworkImpl.java:244)
>         at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>         at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:724)
>
> I am not sure whether this bug is in PathChildrenCache.
>
> I need to restart all instances using curator-2.4.0, which is really bad.
>
> Thank you
> Best, Jae
>



-- 
Osman Sebati Çam

https://twitter.com/osmanscam <https://twitter.com/#!/osmanscam>
http://osmanscam.blogspot.ie
