curator-user mailing list archives

From: "Bae, Jae Hyeon" <metac...@gmail.com>
Subject: Re: curator-2.4.0 cannot recover connection loss
Date: Thu, 10 Apr 2014 16:50:31 GMT
Hi Osman

Thank you for testing. If I can reproduce this problem, I will test the
ZooKeeper 3.4.6 and Curator 2.4.1 combination.
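
The connection state transitions in the quoted logs below can be checked mechanically rather than by reading them by hand. The following is a minimal sketch, assuming the log text is available as a single string; the class name `StateChangeScanner` and the method `stateChanges` are hypothetical helpers and not part of Curator. It only looks for the `State change:` message that `ConnectionStateManager` prints in these logs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StateChangeScanner {
    // Matches the "State change: X" message seen in the quoted logs.
    // [\s>]* tolerates mail-client line wrapping and "> " quote markers
    // between the colon and the state name.
    private static final Pattern STATE_CHANGE =
            Pattern.compile("State change:[\\s>]*([A-Z_]+)");

    /** Returns the connection states in the order they appear in the log text. */
    public static List<String> stateChanges(String logText) {
        List<String> states = new ArrayList<>();
        Matcher m = STATE_CHANGE.matcher(logText);
        while (m.find()) {
            states.add(m.group(1));
        }
        return states;
    }

    public static void main(String[] args) {
        // Two entries copied from the quoted log excerpt, quote markers included.
        String log = String.join("\n",
            "> 08:40:34.464 [main-EventThread] INFO  o.a.c.f.state.ConnectionStateManager",
            "> - State change: SUSPENDED",
            "> 08:41:15.804 [CuratorFramework-0-EventThread] INFO",
            ">  o.a.c.f.state.ConnectionStateManager - State change: RECONNECTED");
        System.out.println(stateChanges(log)); // prints [SUSPENDED, RECONNECTED]
    }
}
```

A sequence that ends in RECONNECTED matches the healthy recovery described in the quoted test; a sequence that ends in SUSPENDED matches the original report.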


On Thu, Apr 10, 2014 at 9:11 AM, Osman <osmanscam@gmail.com> wrote:

> Hi Jae,
>
> Just letting you know that, using ZooKeeper 3.4.6 and Curator 2.4.1, I
> could not reproduce your case in my environment.
> It would be nice if I could see this problem in my environment. How can I
> reproduce it?
>
> After starting the application (using PathChildrenCacheListener), I stop
> ZooKeeper and restart it 40 seconds later.
> The application switches to the RECONNECTED state after the SUSPENDED
> state, reporting ConnectionLoss.
> (After 30 minutes of checking the logs, it had not gone back to the
> SUSPENDED state; it was still connected and listening for child node
> changes.)
>
> java.io.IOException: An existing connection was forcibly closed by the
> remote host
> 08:40:34.464 [main-EventThread] INFO  o.a.c.f.state.ConnectionStateManager
> - State change: SUSPENDED
> 08:40:34.473 [PathChildrenCache-0] ERROR o.a.c.f.r.cache.PathChildrenCache
> -
> 08:40:40.198 [CuratorFramework-0] WARN  org.apache.curator.ConnectionState
> - Connection attempt unsuccessful after 2000 (greater than max timeout of
> 500). Resetting connection and trying again with a new connection.
> 08:40:40.198 [CuratorFramework-0] DEBUG org.apache.zookeeper.ZooKeeper -
> Closing session: 0x0
> 08:40:40.198 [CuratorFramework-0] DEBUG org.apache.zookeeper.ClientCnxn -
> Closing client for session: 0x0
> 08:40:42.344 [CuratorFramework-0] WARN  org.apache.curator.ConnectionState
> - Connection attempt unsuccessful after 2146 (greater than max timeout of
> 500). Resetting connection and trying again with a new connection.
> 08:40:42.344 [CuratorFramework-0] DEBUG org.apache.curator.ConnectionState
> - reset
> 08:40:42.344 [CuratorFramework-0] DEBUG org.apache.zookeeper.ZooKeeper -
> Closing session: 0x0
> 08:40:42.344 [CuratorFramework-0] DEBUG org.apache.zookeeper.ClientCnxn -
> Closing client for session: 0x0
> 08:40:42.403 [CuratorFramework-0-SendThread(127.0.0.1:2181)] INFO
>  org.apache.zookeeper.ClientCnxn - Opening socket connection to server
> 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL
> (unknown error)
> 08:40:42.409 [CuratorFramework-0] ERROR o.a.c.f.imps.CuratorFrameworkImpl
> - Background operation retry gave up
> org.apache.zookeeper.KeeperException$ConnectionLossException:
> KeeperErrorCode = ConnectionLoss
> 08:40:42.410 [CuratorFramework-0] ERROR o.a.c.f.imps.CuratorFrameworkImpl
> - Background retry gave up
> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
> ConnectionLoss
> 08:40:46.920 [CuratorFramework-0] WARN  org.apache.curator.ConnectionState
> - Connection attempt unsuccessful after 1389 (greater than max timeout of
> 500). Resetting connection and trying again with a new connection.
> 08:40:46.920 [CuratorFramework-0] DEBUG org.apache.curator.ConnectionState
> - reset
> 08:40:46.920 [CuratorFramework-0] DEBUG org.apache.zookeeper.ZooKeeper -
> Closing session: 0x0
> 08:40:46.920 [CuratorFramework-0] DEBUG org.apache.zookeeper.ClientCnxn -
> Closing client for session: 0x0
> 08:41:14.303 [CuratorFramework-0-SendThread(0:0:0:0:0:0:0:1:2181)] DEBUG
> o.a.zookeeper.ClientCnxnSocketNIO - Ignoring exception during shutdown input
> java.net.SocketException: Socket is not connected
>
> Then, after the ZooKeeper instance is started again, the PathChildrenCache
> continues to get updated:
> 08:41:15.804 [CuratorFramework-0-EventThread] INFO
>  o.a.c.f.state.ConnectionStateManager - State change: RECONNECTED
>
> Regards.
>
> On 9 April 2014 18:55, Bae, Jae Hyeon <metacret@gmail.com> wrote:
>
>> Last night, I did a rolling restart of ZooKeeper 3.4.5 to update its
>> configuration, and I saw that curator-2.4.0 couldn't recover from the
>> connection loss.
>>
>> ERROR 2014-04-09 17:48:15,231 [DaemonThreadFactory-2-thread-2]
>> org.apache.curator.framework.imps.CuratorFrameworkImpl: Background retry
>> gave up
>> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
>> ConnectionLoss
>>         at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:766)
>>         at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:749)
>>         at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:56)
>>         at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl$3.call(CuratorFrameworkImpl.java:244)
>>         at
>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>         at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:724)
>>
>> INFO  2014-04-09 17:48:15,276 [ServerInventoryView-0-EventThread]
>> org.apache.curator.framework.state.ConnectionStateManager: State change:
>> RECONNECTED
>> INFO  2014-04-09 17:48:15,382 [ServerInventoryView-0-EventThread]
>> org.apache.curator.framework.state.ConnectionStateManager: State change:
>> SUSPENDED
>> ERROR 2014-04-09 17:48:15,748 [DaemonThreadFactory-2-thread-2]
>> org.apache.curator.framework.imps.CuratorFrameworkImpl: Background
>> exception was not retry-able or retry gave up
>> java.lang.NullPointerException
>>         at
>> com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
>>         at
>> com.google.common.collect.Lists$TransformingSequentialList.<init>(Lists.java:527)
>>         at com.google.common.collect.Lists.transform(Lists.java:510)
>>         at
>> org.apache.curator.framework.recipes.cache.PathChildrenCache.processChildren(PathChildrenCache.java:635)
>>         at
>> org.apache.curator.framework.recipes.cache.PathChildrenCache.access$200(PathChildrenCache.java:68)
>>         at
>> org.apache.curator.framework.recipes.cache.PathChildrenCache$4.processResult(PathChildrenCache.java:476)
>>         at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl.sendToBackgroundCallback(CuratorFrameworkImpl.java:686)
>>         at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:659)
>>         at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:783)
>>         at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:749)
>>         at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:56)
>>         at
>> org.apache.curator.framework.imps.CuratorFrameworkImpl$3.call(CuratorFrameworkImpl.java:244)
>>         at
>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>         at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>         at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>         at java.lang.Thread.run(Thread.java:724)
>>
>> I am not sure whether this bug is in PathChildrenCache.
>>
>> I need to restart all instances using curator-2.4.0, which is really bad.
>>
>> Thank you
>> Best, Jae
>>
>
>
>
> --
> Osman Sebati Çam
>
> https://twitter.com/osmanscam <https://twitter.com/#!/osmanscam>
> http://osmanscam.blogspot.ie
>
>
>
>

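The second stack trace in the quoted report shows `Preconditions.checkNotNull` failing inside `Lists.transform`, called from `PathChildrenCache.processChildren`, which suggests that a null argument, most likely the child list, reached the transform. Below is a minimal sketch of a defensive guard for that situation. It assumes, which the thread does not confirm, that a failed background call can deliver a result with no child list. `ChildrenResult` and this `processChildren` are hypothetical stand-ins written for illustration; they are not Curator's classes or its actual fix.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class ChildrenCallbackSketch {
    // ZooKeeper result codes: 0 is OK, -4 is ConnectionLoss.
    public static final int OK = 0;
    public static final int CONNECTION_LOSS = -4;

    /** Hypothetical stand-in for a background "get children" result. */
    public static final class ChildrenResult {
        final int resultCode;
        final List<String> children; // assumed to be null when the call failed

        public ChildrenResult(int resultCode, List<String> children) {
            this.resultCode = resultCode;
            this.children = children;
        }
    }

    /** Builds full child paths, skipping results that carry no child list. */
    public static List<String> processChildren(String parent, ChildrenResult result) {
        if (result.resultCode != OK || result.children == null) {
            // Nothing to transform; the caller can refresh after reconnecting.
            return Collections.emptyList();
        }
        List<String> fullPaths = new ArrayList<>();
        for (String child : result.children) {
            fullPaths.add(parent + "/" + child);
        }
        return fullPaths;
    }

    public static void main(String[] args) {
        ChildrenResult failed = new ChildrenResult(CONNECTION_LOSS, null);
        System.out.println(processChildren("/servers", failed)); // prints []
    }
}
```

With a guard of this kind, a failed result is skipped instead of being passed to a transform that rejects null input.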