curator-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Cameron McKenzie <mckenzie....@gmail.com>
Subject Re: Curator never goes through when ZK is read-only
Date Mon, 11 Jul 2016 23:39:28 GMT
So, it appears that the initial connection occurs ok, but then something
goes wrong

2016-07-11 15:49:15,712 INFO  org.apache.zookeeper.ClientCnxn:1253 - Unable
to read additional data from server sessionid 0x100024f0936005b, likely
server has closed socket, closing socket connection and attempting reconnect

Then a reconnection occurs ok, and then the cycle starts again.

Can you see where in the stack the getData() call is hanging?

On Tue, Jul 12, 2016 at 9:35 AM, Benjamin Jaton <benjamin.jaton@gmail.com>
wrote:

> The blockUntilConnectedOrTimedOut() does return, we see the line:
> "Successfully established the connection with ZooKeeper"
> which happens right after.
>
> The getData() is where it hangs.
>
> We see this first:
> 2016-07-11 15:47:08,420 INFO
> org.apache.curator.framework.state.ConnectionStateManager:228 - State
> change: READ_ONLY
>
> and then we see it again 2 minutes later:
> 2016-07-11 15:49:17,316 INFO
> org.apache.curator.framework.state.ConnectionStateManager:228 - State
> change: READ_ONLY
>
> and then again 2 minutes after that:
> 2016-07-11 15:51:26,470 INFO
> org.apache.curator.framework.state.ConnectionStateManager:228 - State
> change: READ_ONLY
>
> Then I stopped it.
>
> On Mon, Jul 11, 2016 at 4:13 PM, Cameron McKenzie <mckenzie.cam@gmail.com>
> wrote:
>
>> hey Ben,
>> From the logs it looks like the connection is established at the end?
>>
>> Are you saying that the blockUntilConnectedOrTimedOut() does not return
>> when this occurs?
>> cheers
>>
>>
>> On Tue, Jul 12, 2016 at 9:10 AM, Benjamin Jaton <bjaton@radiantlogic.com>
>> wrote:
>>
>>> ZK is 3.5.1-alpha
>>> Curator is 2.11.0
>>>
>>> On Mon, Jul 11, 2016 at 4:01 PM, Benjamin Jaton <bjaton@radiantlogic.com
>>> > wrote:
>>>
>>>> I have a 3 nodes ZK cluster , readonly mode is enabled.
>>>> 2 nodes are down, so the third one is in read-only (verified by using
>>>> the ZK API manually).
>>>>
>>>> I'm using this piece of code:
>>>>
>>>> Builder curatorClientBuilder = CuratorFrameworkFactory.builder()
>>>> .connectString("QA-E8WIN11:2181,QA-E8WIN12:2181")
>>>> .sessionTimeoutMs(45000).connectionTimeoutMs(15000)
>>>> .retryPolicy(new RetryNTimes(3, 5000)).canBeReadOnly(true);
>>>>
>>>> CuratorFramework client = curatorClientBuilder.build();
>>>> client.start();
>>>> client.getZookeeperClient().blockUntilConnectedOrTimedOut();
>>>> System.out.println("Successfully established the connection with
>>>> ZooKeeper");
>>>> client.getData().forPath("/");
>>>> System.out.println("Done.");
>>>>
>>>> When curator pick the host that is UP first, it goes through very
>>>> quickly. When it picks the host that is down first (QA-E8WIN12), it never
>>>> goes through.
>>>>
>>>> Am I missing something here?
>>>>
>>>> Below is the log:
>>>>
>>>> 2016-07-11 15:46:45,733 INFO
>>>>  org.apache.curator.framework.imps.CuratorFrameworkImpl:235 - Starting
>>>> 2016-07-11 15:46:45,743 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:zookeeper.version=3.5.1-alpha-1748895, built on
>>>> 06/17/2016 22:24 GMT
>>>> 2016-07-11 15:46:45,743 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:host.name=w-rli09-ben
>>>> 2016-07-11 15:46:45,744 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:java.version=1.8.0_11
>>>> 2016-07-11 15:46:45,744 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:java.vendor=Oracle Corporation
>>>> 2016-07-11 15:46:45,744 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:java.home=/usr/local/apps/jdk1.8.0_11/jre
>>>> 2016-07-11 15:46:45,744 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:java.class.path=(..)
>>>> 2016-07-11 15:46:45,746 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:java.io.tmpdir=/tmp
>>>> 2016-07-11 15:46:45,746 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:java.compiler=<NA>
>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:os.name=Linux
>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:os.arch=amd64
>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:os.version=3.8.0-44-generic
>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:user.name=benji
>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:user.home=(..)
>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:user.dir=(..)
>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:os.memory.free=385MB
>>>> 2016-07-11 15:46:45,748 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:os.memory.max=5358MB
>>>> 2016-07-11 15:46:45,748 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>> Client environment:os.memory.total=397MB
>>>> 2016-07-11 15:46:45,750 INFO  org.apache.zookeeper.ZooKeeper:716 -
>>>> Initiating client connection, connectString=QA-E8WIN11:2181,QA-E8WIN12:2181
>>>> sessionTimeout=45000 watcher=org.apache.curator.ConnectionState@3e96bacf
>>>> 2016-07-11 15:46:45,769 INFO  org.apache.zookeeper.ClientCnxn:1138 -
>>>> Opening socket connection to server QA-E8WIN12/10.11.8.232:2181. Will
>>>> not attempt to authenticate using SASL (unknown error)
>>>> Successfully established the connection with ZooKeeper
>>>> 2016-07-11 15:47:08,300 INFO  org.apache.zookeeper.ClientCnxn:1251 -
>>>> Client session timed out, have not heard from server in 22534ms for
>>>> sessionid 0x0, closing socket connection and attempting reconnect
>>>> 2016-07-11 15:47:08,403 INFO  org.apache.zookeeper.ClientCnxn:1138 -
>>>> Opening socket connection to server QA-E8WIN11/10.11.8.236:2181. Will
>>>> not attempt to authenticate using SASL (unknown error)
>>>> 2016-07-11 15:47:08,406 INFO  org.apache.zookeeper.ClientCnxn:980 -
>>>> Socket connection established, initiating session, client: /
>>>> 10.11.12.4:47004, server: QA-E8WIN11/10.11.8.236:2181
>>>> 2016-07-11 15:47:08,413 INFO  org.apache.zookeeper.ClientCnxn:1400 -
>>>> Session establishment complete on server QA-E8WIN11/10.11.8.236:2181,
>>>> sessionid = 0x100024f0936005b, negotiated timeout = 45000 (READ-ONLY mode)
>>>> 2016-07-11 15:47:08,415 INFO  org.apache.zookeeper.ClientCnxn:1297 -
>>>> Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 200
>>>> 2016-07-11 15:47:08,420 INFO
>>>>  org.apache.curator.framework.state.ConnectionStateManager:228 - State
>>>> change: READ_ONLY
>>>> 2016-07-11 15:49:15,712 INFO  org.apache.zookeeper.ClientCnxn:1253 -
>>>> Unable to read additional data from server sessionid 0x100024f0936005b,
>>>> likely server has closed socket, closing socket connection and attempting
>>>> reconnect
>>>> 2016-07-11 15:49:15,815 INFO
>>>>  org.apache.curator.framework.state.ConnectionStateManager:228 - State
>>>> change: SUSPENDED
>>>> 2016-07-11 15:49:17,313 INFO  org.apache.zookeeper.ClientCnxn:1138 -
>>>> Opening socket connection to server QA-E8WIN11/10.11.8.236:2181. Will
>>>> not attempt to authenticate using SASL (unknown error)
>>>> 2016-07-11 15:49:17,314 INFO  org.apache.zookeeper.ClientCnxn:980 -
>>>> Socket connection established, initiating session, client: /
>>>> 10.11.12.4:47295, server: QA-E8WIN11/10.11.8.236:2181
>>>> 2016-07-11 15:49:17,316 INFO  org.apache.zookeeper.ClientCnxn:1400 -
>>>> Session establishment complete on server QA-E8WIN11/10.11.8.236:2181,
>>>> sessionid = 0x100024f0936005f, negotiated timeout = 45000 (READ-ONLY mode)
>>>> 2016-07-11 15:49:17,316 INFO  org.apache.zookeeper.ClientCnxn:1297 -
>>>> Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 400
>>>> 2016-07-11 15:49:17,316 INFO
>>>>  org.apache.curator.framework.state.ConnectionStateManager:228 - State
>>>> change: READ_ONLY
>>>> 2016-07-11 15:51:24,607 INFO  org.apache.zookeeper.ClientCnxn:1253 -
>>>> Unable to read additional data from server sessionid 0x100024f0936005f,
>>>> likely server has closed socket, closing socket connection and attempting
>>>> reconnect
>>>> 2016-07-11 15:51:24,713 INFO
>>>>  org.apache.curator.framework.state.ConnectionStateManager:228 - State
>>>> change: SUSPENDED
>>>> 2016-07-11 15:51:26,467 INFO  org.apache.zookeeper.ClientCnxn:1138 -
>>>> Opening socket connection to server QA-E8WIN11/10.11.8.236:2181. Will
>>>> not attempt to authenticate using SASL (unknown error)
>>>> 2016-07-11 15:51:26,468 INFO  org.apache.zookeeper.ClientCnxn:980 -
>>>> Socket connection established, initiating session, client: /
>>>> 10.11.12.4:47576, server: QA-E8WIN11/10.11.8.236:2181
>>>> 2016-07-11 15:51:26,470 INFO  org.apache.zookeeper.ClientCnxn:1400 -
>>>> Session establishment complete on server QA-E8WIN11/10.11.8.236:2181,
>>>> sessionid = 0x100024f09360061, negotiated timeout = 45000 (READ-ONLY mode)
>>>> 2016-07-11 15:51:26,470 INFO  org.apache.zookeeper.ClientCnxn:1297 -
>>>> Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 800
>>>> 2016-07-11 15:51:26,470 INFO
>>>>  org.apache.curator.framework.state.ConnectionStateManager:228 - State
>>>> change: READ_ONLY
>>>> (...)
>>>>
>>>> Thanks
>>>> Ben
>>>>
>>>>
>>>
>>
>

Mime
View raw message