curator-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Benjamin Jaton <benjamin.ja...@gmail.com>
Subject Re: Curator never goes through when ZK is read-only
Date Thu, 06 Oct 2016 21:11:09 GMT
I was thinking that connectionTimeoutMs is the time it would take for 1
node to timeout.
But then once you've checked 1 node, waitTimeMs becomes < 0 and the while
loop exits without checking the other nodes.

But that's just a guess, maybe the problem is somewhere else.

On Thu, Oct 6, 2016 at 2:04 PM, Jordan Zimmerman <jordan@jordanzimmerman.com
> wrote:

> > It seems like it should be more than that since connectionTimeoutMs will
> be the time it takes to check only 1 node.
>
> Why do you think it will only check 1 node? I haven’t looked at that code
> in a long time but a quick check suggests to me that
> internalBlockUntilConnectedOrTimedOut() waits for a successful connection
> irrespective of which node it is.
>
> On Jul 30, 2016, at 12:06 AM, Benjamin Jaton <benjamin.jaton@gmail.com>
> wrote:
>
> The internalBlockUntilConnectedOrTimedOut() method will timeout after
> connectionTimeoutMs.
>
> It seems like it should be more than that since connectionTimeoutMs will
> be the time it takes to check only 1 node.
>
> Thoughts?
>
> On Fri, Jul 29, 2016 at 2:55 PM, Benjamin Jaton <benjamin.jaton@gmail.com>
> wrote:
>
>> Adding the end of the log:
>>
>> 570438 [main] DEBUG org.apache.curator.RetryLoop  - Retry policy not
>> allowing retry
>> 570438 [main-EventThread] INFO org.apache.curator.framework.state.ConnectionStateManager
>> - State change: SUSPENDED
>> Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException:
>> KeeperErrorCode = ConnectionLoss for /
>>     at org.apache.zookeeper.KeeperException.create(
>> KeeperException.java:99)
>>     at org.apache.zookeeper.KeeperException.create(
>> KeeperException.java:51)
>>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1956)
>>     at org.apache.curator.framework.imps.GetDataBuilderImpl$4.
>> call(GetDataBuilderImpl.java:310)
>>     at org.apache.curator.framework.imps.GetDataBuilderImpl$4.
>> call(GetDataBuilderImpl.java:299)
>>     at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108)
>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>> pathInForeground(GetDataBuilderImpl.java:295)
>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>> forPath(GetDataBuilderImpl.java:287)
>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>> forPath(GetDataBuilderImpl.java:34)
>>     at testzkclient.TestCuratorReadOnlyClient.main(
>> TestCuratorReadOnlyClient.java:25)
>>
>> On Fri, Jul 29, 2016 at 2:49 PM, Benjamin Jaton <benjamin.jaton@gmail.com
>> > wrote:
>>
>>> I was making a mistake in the ZK test, the ZK client works properly ( I
>>> was not waiting for the watcher to give me a ConnectedReadOnly event ).
>>>
>>> So I retried with zk 3.5.2 and curator 2.11.0, I have more logs (see
>>> below)
>>>
>>> 10.11.9.70:2181 is the functioning node,
>>> 10.11.157.4:2181 and 10.11.157.210:2181 don't exist.
>>>
>>> I stopped capturing at 300sec.
>>>
>>> This line:
>>> 15043 [main] DEBUG org.apache.curator.CuratorZookeeperClient  -
>>> blockUntilConnectedOrTimedOut() end. isConnected: false
>>> is not right. Why did it stop blocking?
>>> Is there a way to get more info?
>>>
>>> 0 [main] INFO org.apache.curator.framework.imps.CuratorFrameworkImpl  -
>>> Starting
>>> 4 [main] DEBUG org.apache.curator.CuratorZookeeperClient  - Starting
>>> 4 [main] DEBUG org.apache.curator.ConnectionState  - Starting
>>> 4 [main] DEBUG org.apache.curator.ConnectionState  - reset
>>> 12 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:zookeeper.version=3.5.2-alpha-1753710, built on 07/21/2016
>>> 16:24 GMT
>>> 12 [main] INFO org.apache.zookeeper.ZooKeeper  - Client environment:
>>> host.name=w-rli09-ben
>>> 12 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:java.version=1.8.0_11
>>> 12 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:java.vendor=Oracle Corporation
>>> 12 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:java.home=/usr/local/apps/jdk1.8.0_11/jre
>>> 12 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:java.class.path=/home/benji/workspace-trunk/
>>> testzkclient/bin:/home/benji/workspace-trunk/zookeeper-3.5.
>>> 2/build/lib/commons-cli-1.2.jar:/home/benji/workspace-
>>> trunk/zookeeper-3.5.2/build/lib/jackson-core-asl-1.9.11.
>>> jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/
>>> lib/jackson-mapper-asl-1.9.11.jar:/home/benji/workspace-
>>> trunk/zookeeper-3.5.2/build/lib/javacc.jar:/home/benji/
>>> workspace-trunk/zookeeper-3.5.2/build/lib/jetty-6.1.26.jar:/
>>> home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/
>>> jetty-util-6.1.26.jar:/home/benji/workspace-trunk/
>>> zookeeper-3.5.2/build/lib/jline-2.11.jar:/home/benji/
>>> workspace-trunk/zookeeper-3.5.2/build/lib/log4j-1.2.17.jar:/
>>> home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/
>>> netty-3.10.5.Final.jar:/home/benji/workspace-trunk/
>>> zookeeper-3.5.2/build/lib/servlet-api-2.5-20081211.jar:/
>>> home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/
>>> slf4j-api-1.7.5.jar:/home/benji/workspace-trunk/
>>> zookeeper-3.5.2/build/lib/slf4j-log4j12-1.7.5.jar:/home/
>>> benji/workspace-trunk/zookeeper-3.5.2/build/zookeeper-3.5.2-alpha.jar:/
>>> home/benji/workspace-trunk/apache-curator-3.2.0/guava-16.
>>> 0.1.jar:/home/benji/workspace-trunk/apache-curator-3.2.0/
>>> curator-client-2.11.0.jar:/home/benji/workspace-trunk/
>>> apache-curator-3.2.0/curator-framework-2.11.0.jar
>>> 12 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:java.library.path=/usr/java/packages/lib/amd64:/
>>> usr/lib64:/lib64:/lib:/usr/lib
>>> 12 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:java.io.tmpdir=/tmp
>>> 12 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:java.compiler=<NA>
>>> 12 [main] INFO org.apache.zookeeper.ZooKeeper  - Client environment:
>>> os.name=Linux
>>> 13 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:os.arch=amd64
>>> 13 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:os.version=3.8.0-44-generic
>>> 13 [main] INFO org.apache.zookeeper.ZooKeeper  - Client environment:
>>> user.name=benji
>>> 13 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:user.home=/home/benji
>>> 13 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:user.dir=/home/benji/workspace-trunk/testzkclient
>>> 13 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:os.memory.free=378MB
>>> 13 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:os.memory.max=5358MB
>>> 13 [main] INFO org.apache.zookeeper.ZooKeeper  - Client
>>> environment:os.memory.total=381MB
>>> 16 [main] INFO org.apache.zookeeper.ZooKeeper  - Initiating client
>>> connection, connectString=10.11.9.70:2181,10.11.157.4:2181,10.11.157.
>>> 210:2181 sessionTimeout=45000 watcher=org.apache.curator.
>>> ConnectionState@1ff8b8f
>>> 27 [main] DEBUG org.apache.zookeeper.ClientCnxnSocket  - jute.maxbuffer
>>> is 4194304
>>> 39 [main] DEBUG org.apache.curator.CuratorZookeeperClient  -
>>> blockUntilConnectedOrTimedOut() start
>>> 39 [main-SendThread(10.11.157.210:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Opening socket connection to server 10.11.157.210/10.11.157.210:2181.
>>> Will not attempt to authenticate using SASL (unknown error)
>>> *15043 [main] DEBUG org.apache.curator.CuratorZookeeperClient  -
>>> blockUntilConnectedOrTimedOut() end. isConnected: false*
>>> Successfully established the connection with ZooKeeper
>>> 15058 [main-SendThread(10.11.157.210:2181)] WARN org.apache.zookeeper.ClientCnxn
>>> - Client session timed out, have not heard from server in 15023ms for
>>> sessionid 0x0
>>> 15058 [main-SendThread(10.11.157.210:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Client session timed out, have not heard from server in 15023ms for
>>> sessionid 0x0, closing socket connection and attempting reconnect
>>> 15059 [main-SendThread(10.11.157.210:2181)] DEBUG org.apache.zookeeper.ClientCnxnSocketNIO
>>> - Ignoring exception during shutdown input
>>> java.net.SocketException: Socket is not connected
>>>     at sun.nio.ch.Net.translateToSocketException(Net.java:143)
>>>     at sun.nio.ch.Net.translateException(Net.java:177)
>>>     at sun.nio.ch.Net.translateException(Net.java:183)
>>>     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:404)
>>>     at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(
>>> ClientCnxnSocketNIO.java:198)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(
>>> ClientCnxn.java:1325)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(
>>> ClientCnxn.java:1245)
>>> Caused by: java.nio.channels.NotYetConnectedException
>>>     at sun.nio.ch.SocketChannelImpl.shutdownInput(
>>> SocketChannelImpl.java:777)
>>>     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:402)
>>>     ... 3 more
>>> 15062 [main-SendThread(10.11.157.210:2181)] DEBUG org.apache.zookeeper.ClientCnxnSocketNIO
>>> - Ignoring exception during shutdown output
>>> java.net.SocketException: Socket is not connected
>>>     at sun.nio.ch.Net.translateToSocketException(Net.java:143)
>>>     at sun.nio.ch.Net.translateException(Net.java:177)
>>>     at sun.nio.ch.Net.translateException(Net.java:183)
>>>     at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:412)
>>>     at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(
>>> ClientCnxnSocketNIO.java:205)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(
>>> ClientCnxn.java:1325)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(
>>> ClientCnxn.java:1245)
>>> Caused by: java.nio.channels.NotYetConnectedException
>>>     at sun.nio.ch.SocketChannelImpl.shutdownOutput(
>>> SocketChannelImpl.java:794)
>>>     at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:410)
>>>     ... 3 more
>>> 15169 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Opening socket connection to server 10.11.9.70/10.11.9.70:2181. Will
>>> not attempt to authenticate using SASL (unknown error)
>>> 15170 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Socket connection established, initiating session, client: /
>>> 10.11.12.4:44742, server: 10.11.9.70/10.11.9.70:2181
>>> 15173 [main-SendThread(10.11.9.70:2181)] DEBUG org.apache.zookeeper.ClientCnxn
>>> - Session establishment request sent on 10.11.9.70/10.11.9.70:2181
>>> 15180 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Session establishment complete on server 10.11.9.70/10.11.9.70:2181,
>>> sessionid = 0x10003fa439a0060, negotiated timeout = 40000 (READ-ONLY mode)
>>> 15182 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Checking server 10.11.157.4/10.11.157.4:2181 for being r/w. Timeout
>>> 200
>>> 15187 [main-EventThread] INFO org.apache.curator.framework.state.ConnectionStateManager
>>> - State change: READ_ONLY
>>> 142434 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Unable to read additional data from server sessionid 0x10003fa439a0060,
>>> likely server has closed socket, closing socket connection and attempting
>>> reconnect
>>> 142538 [main-EventThread] INFO org.apache.curator.framework.state.ConnectionStateManager
>>> - State change: SUSPENDED
>>> 142545 [main] DEBUG org.apache.curator.RetryLoop  - Retry-able exception
>>> received
>>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>>> KeeperErrorCode = ConnectionLoss for /
>>>     at org.apache.zookeeper.KeeperException.create(
>>> KeeperException.java:99)
>>>     at org.apache.zookeeper.KeeperException.create(
>>> KeeperException.java:51)
>>>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1956)
>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl$4.
>>> call(GetDataBuilderImpl.java:310)
>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl$4.
>>> call(GetDataBuilderImpl.java:299)
>>>     at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108)
>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>>> pathInForeground(GetDataBuilderImpl.java:295)
>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>>> forPath(GetDataBuilderImpl.java:287)
>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>>> forPath(GetDataBuilderImpl.java:34)
>>>     at testzkclient.TestCuratorReadOnlyClient.main(
>>> TestCuratorReadOnlyClient.java:25)
>>> 143211 [main-SendThread(10.11.157.210:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Opening socket connection to server 10.11.157.210/10.11.157.210:2181.
>>> Will not attempt to authenticate using SASL (unknown error)
>>> 147546 [main] DEBUG org.apache.curator.RetryLoop  - Retrying operation
>>> 156556 [main-SendThread(10.11.157.210:2181)] WARN org.apache.zookeeper.ClientCnxn
>>> - Client session timed out, have not heard from server in 14019ms for
>>> sessionid 0x10003fa439a0060
>>> 156557 [main-SendThread(10.11.157.210:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Client session timed out, have not heard from server in 14019ms for
>>> sessionid 0x10003fa439a0060, closing socket connection and attempting
>>> reconnect
>>> 156557 [main-SendThread(10.11.157.210:2181)] DEBUG org.apache.zookeeper.ClientCnxnSocketNIO
>>> - Ignoring exception during shutdown input
>>> java.net.SocketException: Socket is not connected
>>>     at sun.nio.ch.Net.translateToSocketException(Net.java:143)
>>>     at sun.nio.ch.Net.translateException(Net.java:177)
>>>     at sun.nio.ch.Net.translateException(Net.java:183)
>>>     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:404)
>>>     at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(
>>> ClientCnxnSocketNIO.java:198)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(
>>> ClientCnxn.java:1325)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(
>>> ClientCnxn.java:1245)
>>> Caused by: java.nio.channels.NotYetConnectedException
>>>     at sun.nio.ch.SocketChannelImpl.shutdownInput(
>>> SocketChannelImpl.java:777)
>>>     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:402)
>>>     ... 3 more
>>> 156557 [main-SendThread(10.11.157.210:2181)] DEBUG org.apache.zookeeper.ClientCnxnSocketNIO
>>> - Ignoring exception during shutdown output
>>> java.net.SocketException: Socket is not connected
>>>     at sun.nio.ch.Net.translateToSocketException(Net.java:143)
>>>     at sun.nio.ch.Net.translateException(Net.java:177)
>>>     at sun.nio.ch.Net.translateException(Net.java:183)
>>>     at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:412)
>>>     at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(
>>> ClientCnxnSocketNIO.java:205)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(
>>> ClientCnxn.java:1325)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(
>>> ClientCnxn.java:1245)
>>> Caused by: java.nio.channels.NotYetConnectedException
>>>     at sun.nio.ch.SocketChannelImpl.shutdownOutput(
>>> SocketChannelImpl.java:794)
>>>     at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:410)
>>>     ... 3 more
>>> 157544 [Curator-Framework-0] ERROR org.apache.curator.ConnectionState
>>> - Connection timed out for connection string (10.11.9.70:2181,
>>> 10.11.157.4:2181,10.11.157.210:2181) and timeout (15000) / elapsed
>>> (15006)
>>> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
>>> ConnectionLoss
>>>     at org.apache.curator.ConnectionState.checkTimeouts(
>>> ConnectionState.java:197)
>>>     at org.apache.curator.ConnectionState.getZooKeeper(
>>> ConnectionState.java:88)
>>>     at org.apache.curator.CuratorZookeeperClient.getZooKeeper(
>>> CuratorZookeeperClient.java:116)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.
>>> performBackgroundOperation(CuratorFrameworkImpl.java:835)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.
>>> backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.
>>> access$300(CuratorFrameworkImpl.java:64)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.
>>> call(CuratorFrameworkImpl.java:267)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$
>>> ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$
>>> ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(
>>> ThreadPoolExecutor.java:1142)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(
>>> ThreadPoolExecutor.java:617)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 158163 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Opening socket connection to server 10.11.9.70/10.11.9.70:2181. Will
>>> not attempt to authenticate using SASL (unknown error)
>>> 158164 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Socket connection established, initiating session, client: /
>>> 10.11.12.4:45070, server: 10.11.9.70/10.11.9.70:2181
>>> 158164 [main-SendThread(10.11.9.70:2181)] DEBUG org.apache.zookeeper.ClientCnxn
>>> - Session establishment request sent on 10.11.9.70/10.11.9.70:2181
>>> 158166 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Session establishment complete on server 10.11.9.70/10.11.9.70:2181,
>>> sessionid = 0x10003fa439a0061, negotiated timeout = 40000 (READ-ONLY mode)
>>> 158166 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Checking server 10.11.157.4/10.11.157.4:2181 for being r/w. Timeout
>>> 400
>>> 158167 [main-EventThread] INFO org.apache.curator.framework.state.ConnectionStateManager
>>> - State change: READ_ONLY
>>> 285409 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Unable to read additional data from server sessionid 0x10003fa439a0061,
>>> likely server has closed socket, closing socket connection and attempting
>>> reconnect
>>> 285510 [main] DEBUG org.apache.curator.RetryLoop  - Retry-able exception
>>> received
>>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>>> KeeperErrorCode = ConnectionLoss for /
>>>     at org.apache.zookeeper.KeeperException.create(
>>> KeeperException.java:99)
>>>     at org.apache.zookeeper.KeeperException.create(
>>> KeeperException.java:51)
>>>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1956)
>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl$4.
>>> call(GetDataBuilderImpl.java:310)
>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl$4.
>>> call(GetDataBuilderImpl.java:299)
>>>     at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108)
>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>>> pathInForeground(GetDataBuilderImpl.java:295)
>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>>> forPath(GetDataBuilderImpl.java:287)
>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>>> forPath(GetDataBuilderImpl.java:34)
>>>     at testzkclient.TestCuratorReadOnlyClient.main(
>>> TestCuratorReadOnlyClient.java:25)
>>> 285516 [main-EventThread] INFO org.apache.curator.framework.state.ConnectionStateManager
>>> - State change: SUSPENDED
>>> 286312 [main-SendThread(10.11.157.210:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Opening socket connection to server 10.11.157.210/10.11.157.210:2181.
>>> Will not attempt to authenticate using SASL (unknown error)
>>> 290511 [main] DEBUG org.apache.curator.RetryLoop  - Retrying operation
>>> 295516 [Curator-Framework-0] INFO org.apache.curator.framework.state.ConnectionStateManager
>>> - State change: LOST
>>> 295517 [Curator-Framework-0] ERROR org.apache.curator.framework.imps.CuratorFrameworkImpl
>>> - Background operation retry gave up
>>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>>> KeeperErrorCode = ConnectionLoss
>>>     at org.apache.zookeeper.KeeperException.create(
>>> KeeperException.java:99)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.
>>> checkBackgroundRetry(CuratorFrameworkImpl.java:728)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.
>>> performBackgroundOperation(CuratorFrameworkImpl.java:857)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.
>>> backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.
>>> access$300(CuratorFrameworkImpl.java:64)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.
>>> call(CuratorFrameworkImpl.java:267)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$
>>> ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$
>>> ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(
>>> ThreadPoolExecutor.java:1142)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(
>>> ThreadPoolExecutor.java:617)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 295518 [Curator-Framework-0] ERROR org.apache.curator.framework.imps.CuratorFrameworkImpl
>>> - Background retry gave up
>>> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
>>> ConnectionLoss
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.
>>> performBackgroundOperation(CuratorFrameworkImpl.java:838)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.
>>> backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.
>>> access$300(CuratorFrameworkImpl.java:64)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.
>>> call(CuratorFrameworkImpl.java:267)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$
>>> ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$
>>> ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(
>>> ThreadPoolExecutor.java:1142)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(
>>> ThreadPoolExecutor.java:617)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 299659 [main-SendThread(10.11.157.210:2181)] WARN org.apache.zookeeper.ClientCnxn
>>> - Client session timed out, have not heard from server in 14149ms for
>>> sessionid 0x10003fa439a0061
>>> 299659 [main-SendThread(10.11.157.210:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Client session timed out, have not heard from server in 14149ms for
>>> sessionid 0x10003fa439a0061, closing socket connection and attempting
>>> reconnect
>>> 299660 [main-SendThread(10.11.157.210:2181)] DEBUG org.apache.zookeeper.ClientCnxnSocketNIO
>>> - Ignoring exception during shutdown input
>>> java.net.SocketException: Socket is not connected
>>>     at sun.nio.ch.Net.translateToSocketException(Net.java:143)
>>>     at sun.nio.ch.Net.translateException(Net.java:177)
>>>     at sun.nio.ch.Net.translateException(Net.java:183)
>>>     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:404)
>>>     at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(
>>> ClientCnxnSocketNIO.java:198)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(
>>> ClientCnxn.java:1325)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(
>>> ClientCnxn.java:1245)
>>> Caused by: java.nio.channels.NotYetConnectedException
>>>     at sun.nio.ch.SocketChannelImpl.shutdownInput(
>>> SocketChannelImpl.java:777)
>>>     at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:402)
>>>     ... 3 more
>>> 299660 [main-SendThread(10.11.157.210:2181)] DEBUG org.apache.zookeeper.ClientCnxnSocketNIO
>>> - Ignoring exception during shutdown output
>>> java.net.SocketException: Socket is not connected
>>>     at sun.nio.ch.Net.translateToSocketException(Net.java:143)
>>>     at sun.nio.ch.Net.translateException(Net.java:177)
>>>     at sun.nio.ch.Net.translateException(Net.java:183)
>>>     at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:412)
>>>     at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(
>>> ClientCnxnSocketNIO.java:205)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(
>>> ClientCnxn.java:1325)
>>>     at org.apache.zookeeper.ClientCnxn$SendThread.run(
>>> ClientCnxn.java:1245)
>>> Caused by: java.nio.channels.NotYetConnectedException
>>>     at sun.nio.ch.SocketChannelImpl.shutdownOutput(
>>> SocketChannelImpl.java:794)
>>>     at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:410)
>>>     ... 3 more
>>> 300520 [Curator-Framework-0] ERROR org.apache.curator.ConnectionState
>>> - Connection timed out for connection string (10.11.9.70:2181,
>>> 10.11.157.4:2181,10.11.157.210:2181) and timeout (15000) / elapsed
>>> (15004)
>>> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode =
>>> ConnectionLoss
>>>     at org.apache.curator.ConnectionState.checkTimeouts(
>>> ConnectionState.java:197)
>>>     at org.apache.curator.ConnectionState.getZooKeeper(
>>> ConnectionState.java:88)
>>>     at org.apache.curator.CuratorZookeeperClient.getZooKeeper(
>>> CuratorZookeeperClient.java:116)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.
>>> performBackgroundOperation(CuratorFrameworkImpl.java:835)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.
>>> backgroundOperationsLoop(CuratorFrameworkImpl.java:809)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl.
>>> access$300(CuratorFrameworkImpl.java:64)
>>>     at org.apache.curator.framework.imps.CuratorFrameworkImpl$4.
>>> call(CuratorFrameworkImpl.java:267)
>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$
>>> ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>>>     at java.util.concurrent.ScheduledThreadPoolExecutor$
>>> ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(
>>> ThreadPoolExecutor.java:1142)
>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(
>>> ThreadPoolExecutor.java:617)
>>>     at java.lang.Thread.run(Thread.java:745)
>>> 301077 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Opening socket connection to server 10.11.9.70/10.11.9.70:2181. Will
>>> not attempt to authenticate using SASL (unknown error)
>>> 301078 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Socket connection established, initiating session, client: /
>>> 10.11.12.4:45385, server: 10.11.9.70/10.11.9.70:2181
>>> 301078 [main-SendThread(10.11.9.70:2181)] DEBUG org.apache.zookeeper.ClientCnxn
>>> - Session establishment request sent on 10.11.9.70/10.11.9.70:2181
>>> 301081 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Session establishment complete on server 10.11.9.70/10.11.9.70:2181,
>>> sessionid = 0x10003fa439a0062, negotiated timeout = 40000 (READ-ONLY mode)
>>> 301082 [main-SendThread(10.11.9.70:2181)] INFO org.apache.zookeeper.ClientCnxn
>>> - Checking server 10.11.157.4/10.11.157.4:2181 for being r/w. Timeout
>>> 800
>>> 301082 [main-EventThread] INFO org.apache.curator.framework.state.ConnectionStateManager
>>> - State change: READ_ONLY
>>>
>>>
>>> On Tue, Jul 12, 2016 at 4:11 PM, Benjamin Jaton <
>>> benjamin.jaton@gmail.com> wrote:
>>>
>>>> Cameron, you seem to be right, the simple zookeeper client also fails
>>>> sometimes.
>>>> I will bring it up to the ZK team, thank you!
>>>>
>>>> On Tue, Jul 12, 2016 at 3:09 PM, Cameron McKenzie <
>>>> mckenzie.cam@gmail.com> wrote:
>>>>
>>>>> I'm at a bit of a loss then. Looking at the stack trace, it's in the
>>>>> ZK code (not the Curator code) that it appears to be blocking. Maybe try
>>>>> and create a test that removes Curator and just uses the ZK client directly
>>>>> and see if you still have the same problem?
>>>>>
>>>>> On Wed, Jul 13, 2016 at 7:54 AM, Benjamin Jaton <
>>>>> benjamin.jaton@gmail.com> wrote:
>>>>>
>>>>>> Yes, I mean the ping goes through properly
>>>>>>
>>>>>> On Tue, Jul 12, 2016 at 2:50 PM, Cameron McKenzie <
>>>>>> mckenzie.cam@gmail.com> wrote:
>>>>>>
>>>>>>> Are the 'dead' hosts reachable? I seem to recall having problems
>>>>>>> with the ZK client blocking for a long time if you had a host configured
>>>>>>> that was resolvable but not reachable.
>>>>>>>
>>>>>>> On Wed, Jul 13, 2016 at 4:30 AM, Benjamin Jaton <
>>>>>>> benjamin.jaton@gmail.com> wrote:
>>>>>>>
>>>>>>>> More to the story:
>>>>>>>>
>>>>>>>> While it consistently works on the same machine than the sole
>>>>>>>> running ZK node when I specify only 2 of the 3 servers in the connection
>>>>>>>> string,
>>>>>>>> it doesn't work when I specify 3 hosts:
>>>>>>>>
>>>>>>>>
>>>>>>>> 2016-07-12 11:23:46 INFO  o.a.z.ZooKeeper:716 - Initiating client
>>>>>>>> connection, connectString=QA-E8WIN11:2181,
>>>>>>>> QA-E8WIN12:2181,QA-E8WIN13:2181 sessionTimeout=45000
>>>>>>>> watcher=org.apache.curator.ConnectionState@c86b9e3
>>>>>>>> 2016-07-12 11:23:46 INFO  o.a.z.ClientCnxn:1138 - Opening socket
>>>>>>>> connection to server QA-E8WIN13/10.11.8.230:2181. Will not attempt
>>>>>>>> to authenticate using SASL (unknown error)
>>>>>>>> 2016-07-12 11:24:01 INFO  o.a.z.ClientCnxn:1251 - Client session
>>>>>>>> timed out, have not heard from server in 15018ms for sessionid 0x0, closing
>>>>>>>> socket connection and attempting reconnect
>>>>>>>> Successfully established the connection with ZooKeeper
>>>>>>>> 2016-07-12 11:24:01 INFO  o.a.z.ClientCnxn:1138 - Opening socket
>>>>>>>> connection to server QA-E8WIN12/10.11.8.232:2181. Will not attempt
>>>>>>>> to authenticate using SASL (unknown error)
>>>>>>>> 2016-07-12 11:24:16 INFO  o.a.z.ClientCnxn:1251 - Client session
>>>>>>>> timed out, have not heard from server in 15000ms for sessionid 0x0, closing
>>>>>>>> socket connection and attempting reconnect
>>>>>>>> 2016-07-12 11:24:16 INFO  o.a.z.ClientCnxn:1138 - Opening socket
>>>>>>>> connection to server QA-E8WIN11/10.11.8.236:2181. Will not attempt
>>>>>>>> to authenticate using SASL (unknown error)
>>>>>>>> 2016-07-12 11:24:16 INFO  o.a.z.ClientCnxn:980 - Socket connection
>>>>>>>> established, initiating session, client: /10.11.8.236:50423,
>>>>>>>> server: QA-E8WIN11/10.11.8.236:2181
>>>>>>>> 2016-07-12 11:24:16 INFO  o.a.z.ClientCnxn:1400 - Session
>>>>>>>> establishment complete on server QA-E8WIN11/10.11.8.236:2181,
>>>>>>>> sessionid = 0x10003c8af64001b, negotiated timeout = 45000 (READ-ONLY mode)
>>>>>>>> 2016-07-12 11:24:16 INFO  o.a.z.ClientCnxn:1297 - Checking server
>>>>>>>> QA-E8WIN13/10.11.8.230:2181 for being r/w. Timeout 200
>>>>>>>> 2016-07-12 11:24:16 INFO  o.a.c.f.s.ConnectionStateManager:228 -
>>>>>>>> State change: READ_ONLY
>>>>>>>> 2016-07-12 11:24:37 INFO  o.a.z.ClientCnxn:1297 - Checking server
>>>>>>>> QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 400
>>>>>>>> 2016-07-12 11:24:58 INFO  o.a.z.ClientCnxn:1251 - Client session
>>>>>>>> timed out, have not heard from server in 42023ms for sessionid
>>>>>>>> 0x10003c8af64001b, closing socket connection and attempting reconnect
>>>>>>>> 2016-07-12 11:24:58 INFO  o.a.c.f.s.ConnectionStateManager:228 -
>>>>>>>> State change: SUSPENDED
>>>>>>>> 2016-07-12 11:25:00 INFO  o.a.z.ClientCnxn:1138 - Opening socket
>>>>>>>> connection to server QA-E8WIN11/10.11.8.236:2181. Will not attempt
>>>>>>>> to authenticate using SASL (unknown error)
>>>>>>>> 2016-07-12 11:25:00 INFO  o.a.z.ClientCnxn:980 - Socket connection
>>>>>>>> established, initiating session, client: /10.11.8.236:50428,
>>>>>>>> server: QA-E8WIN11/10.11.8.236:2181
>>>>>>>> 2016-07-12 11:25:00 INFO  o.a.z.ClientCnxn:1400 - Session
>>>>>>>> establishment complete on server QA-E8WIN11/10.11.8.236:2181,
>>>>>>>> sessionid = 0x10003c8af64001c, negotiated timeout = 45000 (READ-ONLY mode)
>>>>>>>> 2016-07-12 11:25:00 INFO  o.a.z.ClientCnxn:1297 - Checking server
>>>>>>>> QA-E8WIN13/10.11.8.230:2181 for being r/w. Timeout 800
>>>>>>>> 2016-07-12 11:25:00 INFO  o.a.c.f.s.ConnectionStateManager:228 -
>>>>>>>> State change: READ_ONLY
>>>>>>>> 2016-07-12 11:25:21 INFO  o.a.z.ClientCnxn:1297 - Checking server
>>>>>>>> QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 1600
>>>>>>>> 2016-07-12 11:25:42 INFO  o.a.z.ClientCnxn:1251 - Client session
>>>>>>>> timed out, have not heard from server in 42011ms for sessionid
>>>>>>>> 0x10003c8af64001c, closing socket connection and attempting reconnect
>>>>>>>> 2016-07-12 11:25:42 INFO  o.a.c.f.s.ConnectionStateManager:228 -
>>>>>>>> State change: SUSPENDED
>>>>>>>> 2016-07-12 11:25:44 INFO  o.a.z.ClientCnxn:1138 - Opening socket
>>>>>>>> connection to server QA-E8WIN11/10.11.8.236:2181. Will not attempt
>>>>>>>> to authenticate using SASL (unknown error)
>>>>>>>> 2016-07-12 11:25:44 INFO  o.a.z.ClientCnxn:980 - Socket connection
>>>>>>>> established, initiating session, client: /10.11.8.236:50435,
>>>>>>>> server: QA-E8WIN11/10.11.8.236:2181
>>>>>>>> 2016-07-12 11:25:44 INFO  o.a.z.ClientCnxn:1400 - Session
>>>>>>>> establishment complete on server QA-E8WIN11/10.11.8.236:2181,
>>>>>>>> sessionid = 0x10003c8af64001d, negotiated timeout = 45000 (READ-ONLY mode)
>>>>>>>> 2016-07-12 11:25:44 INFO  o.a.z.ClientCnxn:1297 - Checking server
>>>>>>>> QA-E8WIN13/10.11.8.230:2181 for being r/w. Timeout 3200
>>>>>>>> 2016-07-12 11:25:44 INFO  o.a.c.f.s.ConnectionStateManager:228 -
>>>>>>>> State change: READ_ONLY
>>>>>>>> 2016-07-12 11:26:05 INFO  o.a.z.ClientCnxn:1297 - Checking server
>>>>>>>> QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 6400
>>>>>>>> 2016-07-12 11:26:26 INFO  o.a.z.ClientCnxn:1251 - Client session
>>>>>>>> timed out, have not heard from server in 42003ms for sessionid
>>>>>>>> 0x10003c8af64001d, closing socket connection and attempting reconnect
>>>>>>>> 2016-07-12 11:26:26 INFO  o.a.c.f.s.ConnectionStateManager:228 -
>>>>>>>> State change: SUSPENDED
>>>>>>>> Exception in thread "main" org.apache.zookeeper.KeeperException$ConnectionLossException:
>>>>>>>> KeeperErrorCode = ConnectionLoss for /
>>>>>>>>     at org.apache.zookeeper.KeeperException.create(
>>>>>>>> KeeperException.java:99)
>>>>>>>>     at org.apache.zookeeper.KeeperException.create(
>>>>>>>> KeeperException.java:51)
>>>>>>>>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1753)
>>>>>>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl$4.
>>>>>>>> call(GetDataBuilderImpl.java:310)
>>>>>>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl$4.
>>>>>>>> call(GetDataBuilderImpl.java:299)
>>>>>>>>     at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:
>>>>>>>> 108)
>>>>>>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>>>>>>>> pathInForeground(GetDataBuilderImpl.java:295)
>>>>>>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>>>>>>>> forPath(GetDataBuilderImpl.java:287)
>>>>>>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>>>>>>>> forPath(GetDataBuilderImpl.java:34)
>>>>>>>>     at test.Test.main(Test.java:25)
>>>>>>>>
>>>>>>>>
>>>>>>>> That seems to suggest that adding 1 host to check somehow messes up
>>>>>>>> something in the timeouts.
>>>>>>>> I am suspicious of those:
>>>>>>>> 2016-07-12 11:25:44 INFO  o.a.z.ClientCnxn:1297 - Checking server
>>>>>>>> QA-E8WIN13/10.11.8.230:2181 for being r/w. Timeout 3200
>>>>>>>> They seem to be "blocking" the existing connection we have to the
>>>>>>>> healthy ZK server, which goes stale. Is that possible?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jul 12, 2016 at 11:17 AM, Benjamin Jaton <
>>>>>>>> bjaton@radiantlogic.com> wrote:
>>>>>>>>
>>>>>>>>> I did not use the zkCli tool but the ZK java API directly. And
>>>>>>>>> other tools like ZooInspector (contrib). They were run from a remote
>>>>>>>>> machine.
>>>>>>>>> I ran it on the same machine and the one read only server is
>>>>>>>>> running properly, I can browse it using those methods.
>>>>>>>>>
>>>>>>>>> Trying to make sense of those client-server interactions is
>>>>>>>>> difficult but here is what I got:
>>>>>>>>>
>>>>>>>>> Client:
>>>>>>>>> 2016-07-12 10:55:01,508 INFO  org.apache.zookeeper.ClientCnxn:1138
>>>>>>>>> - *Opening socket connection to server QA-E8WIN11*/
>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using SASL
>>>>>>>>> (unknown error)
>>>>>>>>> 2016-07-12 10:55:01,512 INFO  org.apache.zookeeper.ClientCnxn:980
>>>>>>>>> - *Socket connection established*, initiating session, client: /
>>>>>>>>> 10.11.12.4:60586, server: QA-E8WIN11/10.11.8.236:2181
>>>>>>>>> 2016-07-12 10:55:01,521 INFO  org.apache.zookeeper.ClientCnxn:1400
>>>>>>>>> - *Session establishment complete on server QA-E8WIN11*/
>>>>>>>>> 10.11.8.236:2181, *sessionid = 0x10003c8af640007*, negotiated
>>>>>>>>> timeout = 45000 (READ-ONLY mode)
>>>>>>>>> 2016-07-12 10:55:01,523 INFO  org.apache.zookeeper.ClientCnxn:1297
>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w.
>>>>>>>>> Timeout 200
>>>>>>>>> 2016-07-12 10:55:01,528 INFO  org.apache.curator.framework.
>>>>>>>>> state.ConnectionStateManager:228 - *State change: READ_ONLY*
>>>>>>>>> 2016-07-12 10:57:08,800 INFO  org.apache.zookeeper.ClientCnxn:1253
>>>>>>>>> - *Unable to read additional data from server sessionid
>>>>>>>>> 0x10003c8af640007, likely server has closed socket, closing socket
>>>>>>>>> connection and attempting reconnect*
>>>>>>>>> 2016-07-12 10:57:08,903 INFO  org.apache.curator.framework.
>>>>>>>>> state.ConnectionStateManager:228 - *State change: SUSPENDED*
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Server:
>>>>>>>>> 2016-07-12T10:55:56,468 [myid:] - INFO  [ProcessThread(sid:1
>>>>>>>>> cport:-1)::ZooKeeperServer@678] - *Established session
>>>>>>>>> 0x10003c8af640007 with negotiated timeout 45000* for client /
>>>>>>>>> 10.11.12.4:60586
>>>>>>>>> (...)
>>>>>>>>> 2016-07-12T10:56:41,756 [myid:] - INFO  [SessionTracker:
>>>>>>>>> ZooKeeperServer@384] - *Expiring session 0x10003c8af640007,
>>>>>>>>> timeout of 45000ms exceeded*
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Note: Don't worry about the time being about 1 min off between the
>>>>>>>>> machines.
>>>>>>>>>
>>>>>>>>> What I don't understand is why we are losing the read-only
>>>>>>>>> connection we successfully established. ZK ro server says it timed out?
>>>>>>>>>
>>>>>>>>> Now interestingly, when I run it on the same machine it works and
>>>>>>>>> completes pretty quickly:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2016-07-12 10:43:49 INFO  o.a.z.ZooKeeper:716 - Initiating client
>>>>>>>>> connection, connectString=QA-E8WIN11:2181,QA-E8WIN12:2181
>>>>>>>>> sessionTimeout=45000 watcher=org.apache.curator.
>>>>>>>>> ConnectionState@e4487af
>>>>>>>>> 2016-07-12 10:43:49 INFO  o.a.z.ClientCnxn:1138 - Opening socket
>>>>>>>>> connection to server QA-E8WIN12/10.11.8.232:2181. Will not
>>>>>>>>> attempt to authenticate using SASL (unknown error)
>>>>>>>>> Successfully established the connection with ZooKeeper
>>>>>>>>> 2016-07-12 10:44:10 WARN  o.a.z.ClientCnxn:1257 - Session 0x0 for
>>>>>>>>> server null, unexpected error, closing socket connection and attempting
>>>>>>>>> reconnect
>>>>>>>>> java.net.ConnectException: Connection timed out: no further
>>>>>>>>> information
>>>>>>>>> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>>>> ~[?:1.8.0_91]
>>>>>>>>> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>>>>>>>>> ~[?:1.8.0_91]
>>>>>>>>> at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
>>>>>>>>> ~[zookeeper-3.5.0-alpha.jar:3.5.2-alpha-SNAPSHOT-1702518]
>>>>>>>>> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1236)
>>>>>>>>> [zookeeper-3.5.0-alpha.jar:3.5.2-alpha-SNAPSHOT-1702518]
>>>>>>>>> 2016-07-12 10:44:11 INFO  o.a.z.ClientCnxn:1138 - Opening socket
>>>>>>>>> connection to server QA-E8WIN11/10.11.8.236:2181. Will not
>>>>>>>>> attempt to authenticate using SASL (unknown error)
>>>>>>>>> 2016-07-12 10:44:11 INFO  o.a.z.ClientCnxn:980 - Socket connection
>>>>>>>>> established, initiating session, client: /10.11.8.236:50147,
>>>>>>>>> server: QA-E8WIN11/10.11.8.236:2181
>>>>>>>>> 2016-07-12 10:44:11 INFO  o.a.z.ClientCnxn:1400 - Session
>>>>>>>>> establishment complete on server QA-E8WIN11/10.11.8.236:2181,
>>>>>>>>> sessionid = 0x10003b315b30012, negotiated timeout = 45000 (READ-ONLY mode)
>>>>>>>>> 2016-07-12 10:44:11 INFO  o.a.z.ClientCnxn:1297 - Checking server
>>>>>>>>> QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 200
>>>>>>>>> 2016-07-12 10:44:11 INFO  o.a.c.f.s.ConnectionStateManager:228 -
>>>>>>>>> State change: READ_ONLY
>>>>>>>>> 2016-07-12 10:44:32 INFO  o.a.z.ClientCnxn:1297 - Checking server
>>>>>>>>> QA-E8WIN11/10.11.8.236:2181 for being r/w. Timeout 400
>>>>>>>>> Done.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jul 11, 2016 at 4:42 PM, Cameron McKenzie <
>>>>>>>>> mckenzie.cam@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> It looks like a connection issue based on the logs. When you
>>>>>>>>>> confirmed that the connection was ok via the zkCli, was that running on the
>>>>>>>>>> same host as the read only server? Can you run it from the same host that
>>>>>>>>>> you're running the Java client from?
>>>>>>>>>>
>>>>>>>>>> On Tue, Jul 12, 2016 at 9:40 AM, Benjamin Jaton <
>>>>>>>>>> benjamin.jaton@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Well I was trying to let it run and it eventually fails, with a
>>>>>>>>>>> stack:
>>>>>>>>>>>
>>>>>>>>>>> Successfully established the connection with ZooKeeper
>>>>>>>>>>> 2016-07-11 16:28:21,943 INFO  org.apache.zookeeper.ClientCnxn:1251
>>>>>>>>>>> - Client session timed out, have not heard from server in 22519ms for
>>>>>>>>>>> sessionid 0x0, closing socket connection and attempting reconnect
>>>>>>>>>>> 2016-07-11 16:28:22,046 INFO  org.apache.zookeeper.ClientCnxn:1138
>>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/
>>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using SASL
>>>>>>>>>>> (unknown error)
>>>>>>>>>>> 2016-07-11 16:28:22,050 INFO  org.apache.zookeeper.ClientCnxn:980
>>>>>>>>>>> - Socket connection established, initiating session, client: /
>>>>>>>>>>> 10.11.12.4:54702, server: QA-E8WIN11/10.11.8.236:2181
>>>>>>>>>>> 2016-07-11 16:28:22,058 INFO  org.apache.zookeeper.ClientCnxn:1400
>>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/
>>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f09360080, negotiated
>>>>>>>>>>> timeout = 45000 (READ-ONLY mode)
>>>>>>>>>>> 2016-07-11 16:28:22,061 INFO  org.apache.zookeeper.ClientCnxn:1297
>>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w.
>>>>>>>>>>> Timeout 200
>>>>>>>>>>> 2016-07-11 16:28:22,066 INFO  org.apache.curator.framework.
>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY
>>>>>>>>>>> 2016-07-11 16:30:29,312 INFO  org.apache.zookeeper.ClientCnxn:1253
>>>>>>>>>>> - Unable to read additional data from server sessionid 0x100024f09360080,
>>>>>>>>>>> likely server has closed socket, closing socket connection and attempting
>>>>>>>>>>> reconnect
>>>>>>>>>>> 2016-07-11 16:30:29,416 INFO  org.apache.curator.framework.
>>>>>>>>>>> state.ConnectionStateManager:228 - State change: SUSPENDED
>>>>>>>>>>> 2016-07-11 16:30:31,031 INFO  org.apache.zookeeper.ClientCnxn:1138
>>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/
>>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using SASL
>>>>>>>>>>> (unknown error)
>>>>>>>>>>> 2016-07-11 16:30:31,032 INFO  org.apache.zookeeper.ClientCnxn:980
>>>>>>>>>>> - Socket connection established, initiating session, client: /
>>>>>>>>>>> 10.11.12.4:55156, server: QA-E8WIN11/10.11.8.236:2181
>>>>>>>>>>> 2016-07-11 16:30:31,034 INFO  org.apache.zookeeper.ClientCnxn:1400
>>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/
>>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f09360083, negotiated
>>>>>>>>>>> timeout = 45000 (READ-ONLY mode)
>>>>>>>>>>> 2016-07-11 16:30:31,034 INFO  org.apache.zookeeper.ClientCnxn:1297
>>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w.
>>>>>>>>>>> Timeout 400
>>>>>>>>>>> 2016-07-11 16:30:31,035 INFO  org.apache.curator.framework.
>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY
>>>>>>>>>>> 2016-07-11 16:32:38,335 INFO  org.apache.zookeeper.ClientCnxn:1253
>>>>>>>>>>> - Unable to read additional data from server sessionid 0x100024f09360083,
>>>>>>>>>>> likely server has closed socket, closing socket connection and attempting
>>>>>>>>>>> reconnect
>>>>>>>>>>> 2016-07-11 16:32:38,441 INFO  org.apache.curator.framework.
>>>>>>>>>>> state.ConnectionStateManager:228 - State change: SUSPENDED
>>>>>>>>>>> 2016-07-11 16:32:40,425 INFO  org.apache.zookeeper.ClientCnxn:1138
>>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/
>>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using SASL
>>>>>>>>>>> (unknown error)
>>>>>>>>>>> 2016-07-11 16:32:40,426 INFO  org.apache.zookeeper.ClientCnxn:980
>>>>>>>>>>> - Socket connection established, initiating session, client: /
>>>>>>>>>>> 10.11.12.4:55637, server: QA-E8WIN11/10.11.8.236:2181
>>>>>>>>>>> 2016-07-11 16:32:40,428 INFO  org.apache.zookeeper.ClientCnxn:1400
>>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/
>>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f09360087, negotiated
>>>>>>>>>>> timeout = 45000 (READ-ONLY mode)
>>>>>>>>>>> 2016-07-11 16:32:40,428 INFO  org.apache.zookeeper.ClientCnxn:1297
>>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w.
>>>>>>>>>>> Timeout 800
>>>>>>>>>>> 2016-07-11 16:32:40,428 INFO  org.apache.curator.framework.
>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY
>>>>>>>>>>> 2016-07-11 16:34:47,615 INFO  org.apache.zookeeper.ClientCnxn:1253
>>>>>>>>>>> - Unable to read additional data from server sessionid 0x100024f09360087,
>>>>>>>>>>> likely server has closed socket, closing socket connection and attempting
>>>>>>>>>>> reconnect
>>>>>>>>>>> 2016-07-11 16:34:47,716 INFO  org.apache.curator.framework.
>>>>>>>>>>> state.ConnectionStateManager:228 - State change: SUSPENDED
>>>>>>>>>>> 2016-07-11 16:34:49,557 INFO  org.apache.zookeeper.ClientCnxn:1138
>>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/
>>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using SASL
>>>>>>>>>>> (unknown error)
>>>>>>>>>>> 2016-07-11 16:34:49,558 INFO  org.apache.zookeeper.ClientCnxn:980
>>>>>>>>>>> - Socket connection established, initiating session, client: /
>>>>>>>>>>> 10.11.12.4:56094, server: QA-E8WIN11/10.11.8.236:2181
>>>>>>>>>>> 2016-07-11 16:34:49,560 INFO  org.apache.zookeeper.ClientCnxn:1400
>>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/
>>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f09360089, negotiated
>>>>>>>>>>> timeout = 45000 (READ-ONLY mode)
>>>>>>>>>>> 2016-07-11 16:34:49,560 INFO  org.apache.zookeeper.ClientCnxn:1297
>>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w.
>>>>>>>>>>> Timeout 1600
>>>>>>>>>>> 2016-07-11 16:34:49,560 INFO  org.apache.curator.framework.
>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY
>>>>>>>>>>> 2016-07-11 16:36:56,895 INFO  org.apache.zookeeper.ClientCnxn:1253
>>>>>>>>>>> - Unable to read additional data from server sessionid 0x100024f09360089,
>>>>>>>>>>> likely server has closed socket, closing socket connection and attempting
>>>>>>>>>>> reconnect
>>>>>>>>>>> Exception in thread "main" 2016-07-11 16:36:56,996 INFO
>>>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 -
>>>>>>>>>>> State change: SUSPENDED
>>>>>>>>>>> org.apache.zookeeper.KeeperException$ConnectionLossException:
>>>>>>>>>>> KeeperErrorCode = ConnectionLoss for /
>>>>>>>>>>>     at org.apache.zookeeper.KeeperException.create(
>>>>>>>>>>> KeeperException.java:99)
>>>>>>>>>>>     at org.apache.zookeeper.KeeperException.create(
>>>>>>>>>>> KeeperException.java:51)
>>>>>>>>>>>     at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.
>>>>>>>>>>> java:1753)
>>>>>>>>>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl$4.
>>>>>>>>>>> call(GetDataBuilderImpl.java:310)
>>>>>>>>>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl$4.
>>>>>>>>>>> call(GetDataBuilderImpl.java:299)
>>>>>>>>>>>     at org.apache.curator.RetryLoop.
>>>>>>>>>>> callWithRetry(RetryLoop.java:108)
>>>>>>>>>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>>>>>>>>>>> pathInForeground(GetDataBuilderImpl.java:295)
>>>>>>>>>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>>>>>>>>>>> forPath(GetDataBuilderImpl.java:287)
>>>>>>>>>>>     at org.apache.curator.framework.imps.GetDataBuilderImpl.
>>>>>>>>>>> forPath(GetDataBuilderImpl.java:34)
>>>>>>>>>>>     at TestCuratorReadOnly.main(TestCuratorReadOnly.java:23)
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 11, 2016 at 4:39 PM, Cameron McKenzie <
>>>>>>>>>>> mckenzie.cam@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> So, it appears that the initial connection occurs ok, but then
>>>>>>>>>>>> something goes wrong
>>>>>>>>>>>>
>>>>>>>>>>>> 2016-07-11 15:49:15,712 INFO  org.apache.zookeeper.ClientCnxn:1253
>>>>>>>>>>>> - Unable to read additional data from server sessionid 0x100024f0936005b,
>>>>>>>>>>>> likely server has closed socket, closing socket connection and attempting
>>>>>>>>>>>> reconnect
>>>>>>>>>>>>
>>>>>>>>>>>> Then a reconnection occurs ok, and then the cycle starts again.
>>>>>>>>>>>>
>>>>>>>>>>>> Can you see where in the stack the getData() call is hanging?
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Jul 12, 2016 at 9:35 AM, Benjamin Jaton <
>>>>>>>>>>>> benjamin.jaton@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> The blockUntilConnectedOrTimedOut() does return, we see the
>>>>>>>>>>>>> line:
>>>>>>>>>>>>> "Successfully established the connection with ZooKeeper"
>>>>>>>>>>>>> which happens right after.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The getData() is where it hangs.
>>>>>>>>>>>>>
>>>>>>>>>>>>> We see this first:
>>>>>>>>>>>>> 2016-07-11 15:47:08,420 INFO  org.apache.curator.framework.
>>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY
>>>>>>>>>>>>>
>>>>>>>>>>>>> and then we see it again 2 minutes later:
>>>>>>>>>>>>> 2016-07-11 15:49:17,316 INFO  org.apache.curator.framework.
>>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY
>>>>>>>>>>>>>
>>>>>>>>>>>>> and then again 2 minutes after that:
>>>>>>>>>>>>> 2016-07-11 15:51:26,470 INFO  org.apache.curator.framework.
>>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY
>>>>>>>>>>>>>
>>>>>>>>>>>>> Then I stopped it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Jul 11, 2016 at 4:13 PM, Cameron McKenzie <
>>>>>>>>>>>>> mckenzie.cam@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> hey Ben,
>>>>>>>>>>>>>> From the logs it looks like the connection is established at
>>>>>>>>>>>>>> the end?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Are you saying that the blockUntilConnectedOrTimedOut() does
>>>>>>>>>>>>>> not return when this occurs?
>>>>>>>>>>>>>> cheers
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Jul 12, 2016 at 9:10 AM, Benjamin Jaton <
>>>>>>>>>>>>>> bjaton@radiantlogic.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ZK is 3.5.1-alpha
>>>>>>>>>>>>>>> Curator is 2.11.0
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Jul 11, 2016 at 4:01 PM, Benjamin Jaton <
>>>>>>>>>>>>>>> bjaton@radiantlogic.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have a 3 nodes ZK cluster , readonly mode is enabled.
>>>>>>>>>>>>>>>> 2 nodes are down, so the third one is in read-only
>>>>>>>>>>>>>>>> (verified by using the ZK API manually).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm using this piece of code:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Builder curatorClientBuilder = CuratorFrameworkFactory.
>>>>>>>>>>>>>>>> builder()
>>>>>>>>>>>>>>>> .connectString("QA-E8WIN11:2181,QA-E8WIN12:2181")
>>>>>>>>>>>>>>>> .sessionTimeoutMs(45000).connectionTimeoutMs(15000)
>>>>>>>>>>>>>>>> .retryPolicy(new RetryNTimes(3, 5000)).canBeReadOnly(true);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> CuratorFramework client = curatorClientBuilder.build();
>>>>>>>>>>>>>>>> client.start();
>>>>>>>>>>>>>>>> client.getZookeeperClient().blockUntilConnectedOrTimedOut(
>>>>>>>>>>>>>>>> );
>>>>>>>>>>>>>>>> System.out.println("Successfully established the
>>>>>>>>>>>>>>>> connection with ZooKeeper");
>>>>>>>>>>>>>>>> client.getData().forPath("/");
>>>>>>>>>>>>>>>> System.out.println("Done.");
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When curator pick the host that is UP first, it goes
>>>>>>>>>>>>>>>> through very quickly. When it picks the host that is down first
>>>>>>>>>>>>>>>> (QA-E8WIN12), it never goes through.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Am I missing something here?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Below is the log:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,733 INFO  org.apache.curator.framework.imps.CuratorFrameworkImpl:235
>>>>>>>>>>>>>>>> - Starting
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,743 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:zookeeper.version=3.5.1-alpha-1748895,
>>>>>>>>>>>>>>>> built on 06/17/2016 22:24 GMT
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,743 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:host.name=w-rli09-ben
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,744 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:java.version=1.8.0_11
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,744 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:java.vendor=Oracle Corporation
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,744 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:java.home=/usr/
>>>>>>>>>>>>>>>> local/apps/jdk1.8.0_11/jre
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,744 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:java.class.path=(..)
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,746 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:java.io.tmpdir=/tmp
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,746 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:java.compiler=<NA>
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:os.name=Linux
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:os.arch=amd64
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:os.version=3.8.0-44-generic
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:user.name=benji
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:user.home=(..)
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:user.dir=(..)
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:os.memory.free=385MB
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,748 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:os.memory.max=5358MB
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,748 INFO  org.apache.zookeeper.ZooKeeper:109
>>>>>>>>>>>>>>>> - Client environment:os.memory.total=397MB
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,750 INFO  org.apache.zookeeper.ZooKeeper:716
>>>>>>>>>>>>>>>> - Initiating client connection, connectString=QA-E8WIN11:2181,QA-E8WIN12:2181
>>>>>>>>>>>>>>>> sessionTimeout=45000 watcher=org.apache.curator.
>>>>>>>>>>>>>>>> ConnectionState@3e96bacf
>>>>>>>>>>>>>>>> 2016-07-11 15:46:45,769 INFO  org.apache.zookeeper.ClientCnxn:1138
>>>>>>>>>>>>>>>> - Opening socket connection to server QA-E8WIN12/
>>>>>>>>>>>>>>>> 10.11.8.232:2181. Will not attempt to authenticate using
>>>>>>>>>>>>>>>> SASL (unknown error)
>>>>>>>>>>>>>>>> Successfully established the connection with ZooKeeper
>>>>>>>>>>>>>>>> 2016-07-11 15:47:08,300 INFO  org.apache.zookeeper.ClientCnxn:1251
>>>>>>>>>>>>>>>> - Client session timed out, have not heard from server in 22534ms for
>>>>>>>>>>>>>>>> sessionid 0x0, closing socket connection and attempting reconnect
>>>>>>>>>>>>>>>> 2016-07-11 15:47:08,403 INFO  org.apache.zookeeper.ClientCnxn:1138
>>>>>>>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/
>>>>>>>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using
>>>>>>>>>>>>>>>> SASL (unknown error)
>>>>>>>>>>>>>>>> 2016-07-11 15:47:08,406 INFO  org.apache.zookeeper.ClientCnxn:980
>>>>>>>>>>>>>>>> - Socket connection established, initiating session, client: /
>>>>>>>>>>>>>>>> 10.11.12.4:47004, server: QA-E8WIN11/10.11.8.236:2181
>>>>>>>>>>>>>>>> 2016-07-11 15:47:08,413 INFO  org.apache.zookeeper.ClientCnxn:1400
>>>>>>>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/
>>>>>>>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f0936005b,
>>>>>>>>>>>>>>>> negotiated timeout = 45000 (READ-ONLY mode)
>>>>>>>>>>>>>>>> 2016-07-11 15:47:08,415 INFO  org.apache.zookeeper.ClientCnxn:1297
>>>>>>>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being
>>>>>>>>>>>>>>>> r/w. Timeout 200
>>>>>>>>>>>>>>>> 2016-07-11 15:47:08,420 INFO  org.apache.curator.framework.
>>>>>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY
>>>>>>>>>>>>>>>> 2016-07-11 15:49:15,712 INFO  org.apache.zookeeper.ClientCnxn:1253
>>>>>>>>>>>>>>>> - Unable to read additional data from server sessionid 0x100024f0936005b,
>>>>>>>>>>>>>>>> likely server has closed socket, closing socket connection and attempting
>>>>>>>>>>>>>>>> reconnect
>>>>>>>>>>>>>>>> 2016-07-11 15:49:15,815 INFO  org.apache.curator.framework.
>>>>>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: SUSPENDED
>>>>>>>>>>>>>>>> 2016-07-11 15:49:17,313 INFO  org.apache.zookeeper.ClientCnxn:1138
>>>>>>>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/
>>>>>>>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using
>>>>>>>>>>>>>>>> SASL (unknown error)
>>>>>>>>>>>>>>>> 2016-07-11 15:49:17,314 INFO  org.apache.zookeeper.ClientCnxn:980
>>>>>>>>>>>>>>>> - Socket connection established, initiating session, client: /
>>>>>>>>>>>>>>>> 10.11.12.4:47295, server: QA-E8WIN11/10.11.8.236:2181
>>>>>>>>>>>>>>>> 2016-07-11 15:49:17,316 INFO  org.apache.zookeeper.ClientCnxn:1400
>>>>>>>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/
>>>>>>>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f0936005f,
>>>>>>>>>>>>>>>> negotiated timeout = 45000 (READ-ONLY mode)
>>>>>>>>>>>>>>>> 2016-07-11 15:49:17,316 INFO  org.apache.zookeeper.ClientCnxn:1297
>>>>>>>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being
>>>>>>>>>>>>>>>> r/w. Timeout 400
>>>>>>>>>>>>>>>> 2016-07-11 15:49:17,316 INFO  org.apache.curator.framework.
>>>>>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY
>>>>>>>>>>>>>>>> 2016-07-11 15:51:24,607 INFO  org.apache.zookeeper.ClientCnxn:1253
>>>>>>>>>>>>>>>> - Unable to read additional data from server sessionid 0x100024f0936005f,
>>>>>>>>>>>>>>>> likely server has closed socket, closing socket connection and attempting
>>>>>>>>>>>>>>>> reconnect
>>>>>>>>>>>>>>>> 2016-07-11 15:51:24,713 INFO  org.apache.curator.framework.
>>>>>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: SUSPENDED
>>>>>>>>>>>>>>>> 2016-07-11 15:51:26,467 INFO  org.apache.zookeeper.ClientCnxn:1138
>>>>>>>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/
>>>>>>>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using
>>>>>>>>>>>>>>>> SASL (unknown error)
>>>>>>>>>>>>>>>> 2016-07-11 15:51:26,468 INFO  org.apache.zookeeper.ClientCnxn:980
>>>>>>>>>>>>>>>> - Socket connection established, initiating session, client: /
>>>>>>>>>>>>>>>> 10.11.12.4:47576, server: QA-E8WIN11/10.11.8.236:2181
>>>>>>>>>>>>>>>> 2016-07-11 15:51:26,470 INFO  org.apache.zookeeper.ClientCnxn:1400
>>>>>>>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/
>>>>>>>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f09360061,
>>>>>>>>>>>>>>>> negotiated timeout = 45000 (READ-ONLY mode)
>>>>>>>>>>>>>>>> 2016-07-11 15:51:26,470 INFO  org.apache.zookeeper.ClientCnxn:1297
>>>>>>>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being
>>>>>>>>>>>>>>>> r/w. Timeout 800
>>>>>>>>>>>>>>>> 2016-07-11 15:51:26,470 INFO  org.apache.curator.framework.
>>>>>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY
>>>>>>>>>>>>>>>> (...)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>> Ben
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
>

Mime
View raw message