curator-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Benjamin Jaton <benjamin.ja...@gmail.com>
Subject Re: Curator never goes through when ZK is read-only
Date Mon, 11 Jul 2016 23:40:53 GMT
Well I was trying to let it run and it eventually fails, with a stack:

Successfully established the connection with ZooKeeper
2016-07-11 16:28:21,943 INFO  org.apache.zookeeper.ClientCnxn:1251 - Client
session timed out, have not heard from server in 22519ms for sessionid 0x0,
closing socket connection and attempting reconnect
2016-07-11 16:28:22,046 INFO  org.apache.zookeeper.ClientCnxn:1138 -
Opening socket connection to server QA-E8WIN11/10.11.8.236:2181. Will not
attempt to authenticate using SASL (unknown error)
2016-07-11 16:28:22,050 INFO  org.apache.zookeeper.ClientCnxn:980 - Socket
connection established, initiating session, client: /10.11.12.4:54702,
server: QA-E8WIN11/10.11.8.236:2181
2016-07-11 16:28:22,058 INFO  org.apache.zookeeper.ClientCnxn:1400 -
Session establishment complete on server QA-E8WIN11/10.11.8.236:2181,
sessionid = 0x100024f09360080, negotiated timeout = 45000 (READ-ONLY mode)
2016-07-11 16:28:22,061 INFO  org.apache.zookeeper.ClientCnxn:1297 -
Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 200
2016-07-11 16:28:22,066 INFO
org.apache.curator.framework.state.ConnectionStateManager:228 - State
change: READ_ONLY
2016-07-11 16:30:29,312 INFO  org.apache.zookeeper.ClientCnxn:1253 - Unable
to read additional data from server sessionid 0x100024f09360080, likely
server has closed socket, closing socket connection and attempting reconnect
2016-07-11 16:30:29,416 INFO
org.apache.curator.framework.state.ConnectionStateManager:228 - State
change: SUSPENDED
2016-07-11 16:30:31,031 INFO  org.apache.zookeeper.ClientCnxn:1138 -
Opening socket connection to server QA-E8WIN11/10.11.8.236:2181. Will not
attempt to authenticate using SASL (unknown error)
2016-07-11 16:30:31,032 INFO  org.apache.zookeeper.ClientCnxn:980 - Socket
connection established, initiating session, client: /10.11.12.4:55156,
server: QA-E8WIN11/10.11.8.236:2181
2016-07-11 16:30:31,034 INFO  org.apache.zookeeper.ClientCnxn:1400 -
Session establishment complete on server QA-E8WIN11/10.11.8.236:2181,
sessionid = 0x100024f09360083, negotiated timeout = 45000 (READ-ONLY mode)
2016-07-11 16:30:31,034 INFO  org.apache.zookeeper.ClientCnxn:1297 -
Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 400
2016-07-11 16:30:31,035 INFO
org.apache.curator.framework.state.ConnectionStateManager:228 - State
change: READ_ONLY
2016-07-11 16:32:38,335 INFO  org.apache.zookeeper.ClientCnxn:1253 - Unable
to read additional data from server sessionid 0x100024f09360083, likely
server has closed socket, closing socket connection and attempting reconnect
2016-07-11 16:32:38,441 INFO
org.apache.curator.framework.state.ConnectionStateManager:228 - State
change: SUSPENDED
2016-07-11 16:32:40,425 INFO  org.apache.zookeeper.ClientCnxn:1138 -
Opening socket connection to server QA-E8WIN11/10.11.8.236:2181. Will not
attempt to authenticate using SASL (unknown error)
2016-07-11 16:32:40,426 INFO  org.apache.zookeeper.ClientCnxn:980 - Socket
connection established, initiating session, client: /10.11.12.4:55637,
server: QA-E8WIN11/10.11.8.236:2181
2016-07-11 16:32:40,428 INFO  org.apache.zookeeper.ClientCnxn:1400 -
Session establishment complete on server QA-E8WIN11/10.11.8.236:2181,
sessionid = 0x100024f09360087, negotiated timeout = 45000 (READ-ONLY mode)
2016-07-11 16:32:40,428 INFO  org.apache.zookeeper.ClientCnxn:1297 -
Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 800
2016-07-11 16:32:40,428 INFO
org.apache.curator.framework.state.ConnectionStateManager:228 - State
change: READ_ONLY
2016-07-11 16:34:47,615 INFO  org.apache.zookeeper.ClientCnxn:1253 - Unable
to read additional data from server sessionid 0x100024f09360087, likely
server has closed socket, closing socket connection and attempting reconnect
2016-07-11 16:34:47,716 INFO
org.apache.curator.framework.state.ConnectionStateManager:228 - State
change: SUSPENDED
2016-07-11 16:34:49,557 INFO  org.apache.zookeeper.ClientCnxn:1138 -
Opening socket connection to server QA-E8WIN11/10.11.8.236:2181. Will not
attempt to authenticate using SASL (unknown error)
2016-07-11 16:34:49,558 INFO  org.apache.zookeeper.ClientCnxn:980 - Socket
connection established, initiating session, client: /10.11.12.4:56094,
server: QA-E8WIN11/10.11.8.236:2181
2016-07-11 16:34:49,560 INFO  org.apache.zookeeper.ClientCnxn:1400 -
Session establishment complete on server QA-E8WIN11/10.11.8.236:2181,
sessionid = 0x100024f09360089, negotiated timeout = 45000 (READ-ONLY mode)
2016-07-11 16:34:49,560 INFO  org.apache.zookeeper.ClientCnxn:1297 -
Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 1600
2016-07-11 16:34:49,560 INFO
org.apache.curator.framework.state.ConnectionStateManager:228 - State
change: READ_ONLY
2016-07-11 16:36:56,895 INFO  org.apache.zookeeper.ClientCnxn:1253 - Unable
to read additional data from server sessionid 0x100024f09360089, likely
server has closed socket, closing socket connection and attempting reconnect
Exception in thread "main" 2016-07-11 16:36:56,996 INFO
org.apache.curator.framework.state.ConnectionStateManager:228 - State
change: SUSPENDED
org.apache.zookeeper.KeeperException$ConnectionLossException:
KeeperErrorCode = ConnectionLoss for /
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1753)
    at
org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:310)
    at
org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:299)
    at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108)
    at
org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:295)
    at
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:287)
    at
org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:34)
    at TestCuratorReadOnly.main(TestCuratorReadOnly.java:23)

On Mon, Jul 11, 2016 at 4:39 PM, Cameron McKenzie <mckenzie.cam@gmail.com>
wrote:

> So, it appears that the initial connection occurs ok, but then something
> goes wrong
>
> 2016-07-11 15:49:15,712 INFO  org.apache.zookeeper.ClientCnxn:1253 -
> Unable to read additional data from server sessionid 0x100024f0936005b,
> likely server has closed socket, closing socket connection and attempting
> reconnect
>
> Then a reconnection occurs ok, and then the cycle starts again.
>
> Can you see where in the stack the getData() call is hanging?
>
> On Tue, Jul 12, 2016 at 9:35 AM, Benjamin Jaton <benjamin.jaton@gmail.com>
> wrote:
>
>> The blockUntilConnectedOrTimedOut() does return, we see the line:
>> "Successfully established the connection with ZooKeeper"
>> which happens right after.
>>
>> The getData() is where it hangs.
>>
>> We see this first:
>> 2016-07-11 15:47:08,420 INFO
>> org.apache.curator.framework.state.ConnectionStateManager:228 - State
>> change: READ_ONLY
>>
>> and then we see it again 2 minutes later:
>> 2016-07-11 15:49:17,316 INFO
>> org.apache.curator.framework.state.ConnectionStateManager:228 - State
>> change: READ_ONLY
>>
>> and then again 2 minutes after that:
>> 2016-07-11 15:51:26,470 INFO
>> org.apache.curator.framework.state.ConnectionStateManager:228 - State
>> change: READ_ONLY
>>
>> Then I stopped it.
>>
>> On Mon, Jul 11, 2016 at 4:13 PM, Cameron McKenzie <mckenzie.cam@gmail.com
>> > wrote:
>>
>>> hey Ben,
>>> From the logs it looks like the connection is established at the end?
>>>
>>> Are you saying that the blockUntilConnectedOrTimedOut() does not return
>>> when this occurs?
>>> cheers
>>>
>>>
>>> On Tue, Jul 12, 2016 at 9:10 AM, Benjamin Jaton <bjaton@radiantlogic.com
>>> > wrote:
>>>
>>>> ZK is 3.5.1-alpha
>>>> Curator is 2.11.0
>>>>
>>>> On Mon, Jul 11, 2016 at 4:01 PM, Benjamin Jaton <
>>>> bjaton@radiantlogic.com> wrote:
>>>>
>>>>> I have a 3 nodes ZK cluster , readonly mode is enabled.
>>>>> 2 nodes are down, so the third one is in read-only (verified by using
>>>>> the ZK API manually).
>>>>>
>>>>> I'm using this piece of code:
>>>>>
>>>>> Builder curatorClientBuilder = CuratorFrameworkFactory.builder()
>>>>> .connectString("QA-E8WIN11:2181,QA-E8WIN12:2181")
>>>>> .sessionTimeoutMs(45000).connectionTimeoutMs(15000)
>>>>> .retryPolicy(new RetryNTimes(3, 5000)).canBeReadOnly(true);
>>>>>
>>>>> CuratorFramework client = curatorClientBuilder.build();
>>>>> client.start();
>>>>> client.getZookeeperClient().blockUntilConnectedOrTimedOut();
>>>>> System.out.println("Successfully established the connection with
>>>>> ZooKeeper");
>>>>> client.getData().forPath("/");
>>>>> System.out.println("Done.");
>>>>>
>>>>> When curator pick the host that is UP first, it goes through very
>>>>> quickly. When it picks the host that is down first (QA-E8WIN12), it never
>>>>> goes through.
>>>>>
>>>>> Am I missing something here?
>>>>>
>>>>> Below is the log:
>>>>>
>>>>> 2016-07-11 15:46:45,733 INFO
>>>>>  org.apache.curator.framework.imps.CuratorFrameworkImpl:235 - Starting
>>>>> 2016-07-11 15:46:45,743 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:zookeeper.version=3.5.1-alpha-1748895, built on
>>>>> 06/17/2016 22:24 GMT
>>>>> 2016-07-11 15:46:45,743 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:host.name=w-rli09-ben
>>>>> 2016-07-11 15:46:45,744 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:java.version=1.8.0_11
>>>>> 2016-07-11 15:46:45,744 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:java.vendor=Oracle Corporation
>>>>> 2016-07-11 15:46:45,744 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:java.home=/usr/local/apps/jdk1.8.0_11/jre
>>>>> 2016-07-11 15:46:45,744 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:java.class.path=(..)
>>>>> 2016-07-11 15:46:45,746 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:java.io.tmpdir=/tmp
>>>>> 2016-07-11 15:46:45,746 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:java.compiler=<NA>
>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:os.name=Linux
>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:os.arch=amd64
>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:os.version=3.8.0-44-generic
>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:user.name=benji
>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:user.home=(..)
>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:user.dir=(..)
>>>>> 2016-07-11 15:46:45,747 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:os.memory.free=385MB
>>>>> 2016-07-11 15:46:45,748 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:os.memory.max=5358MB
>>>>> 2016-07-11 15:46:45,748 INFO  org.apache.zookeeper.ZooKeeper:109 -
>>>>> Client environment:os.memory.total=397MB
>>>>> 2016-07-11 15:46:45,750 INFO  org.apache.zookeeper.ZooKeeper:716 -
>>>>> Initiating client connection, connectString=QA-E8WIN11:2181,QA-E8WIN12:2181
>>>>> sessionTimeout=45000 watcher=org.apache.curator.ConnectionState@3e96bacf
>>>>> 2016-07-11 15:46:45,769 INFO  org.apache.zookeeper.ClientCnxn:1138 -
>>>>> Opening socket connection to server QA-E8WIN12/10.11.8.232:2181. Will
>>>>> not attempt to authenticate using SASL (unknown error)
>>>>> Successfully established the connection with ZooKeeper
>>>>> 2016-07-11 15:47:08,300 INFO  org.apache.zookeeper.ClientCnxn:1251 -
>>>>> Client session timed out, have not heard from server in 22534ms for
>>>>> sessionid 0x0, closing socket connection and attempting reconnect
>>>>> 2016-07-11 15:47:08,403 INFO  org.apache.zookeeper.ClientCnxn:1138 -
>>>>> Opening socket connection to server QA-E8WIN11/10.11.8.236:2181. Will
>>>>> not attempt to authenticate using SASL (unknown error)
>>>>> 2016-07-11 15:47:08,406 INFO  org.apache.zookeeper.ClientCnxn:980 -
>>>>> Socket connection established, initiating session, client: /
>>>>> 10.11.12.4:47004, server: QA-E8WIN11/10.11.8.236:2181
>>>>> 2016-07-11 15:47:08,413 INFO  org.apache.zookeeper.ClientCnxn:1400 -
>>>>> Session establishment complete on server QA-E8WIN11/10.11.8.236:2181,
>>>>> sessionid = 0x100024f0936005b, negotiated timeout = 45000 (READ-ONLY
mode)
>>>>> 2016-07-11 15:47:08,415 INFO  org.apache.zookeeper.ClientCnxn:1297 -
>>>>> Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 200
>>>>> 2016-07-11 15:47:08,420 INFO
>>>>>  org.apache.curator.framework.state.ConnectionStateManager:228 - State
>>>>> change: READ_ONLY
>>>>> 2016-07-11 15:49:15,712 INFO  org.apache.zookeeper.ClientCnxn:1253 -
>>>>> Unable to read additional data from server sessionid 0x100024f0936005b,
>>>>> likely server has closed socket, closing socket connection and attempting
>>>>> reconnect
>>>>> 2016-07-11 15:49:15,815 INFO
>>>>>  org.apache.curator.framework.state.ConnectionStateManager:228 - State
>>>>> change: SUSPENDED
>>>>> 2016-07-11 15:49:17,313 INFO  org.apache.zookeeper.ClientCnxn:1138 -
>>>>> Opening socket connection to server QA-E8WIN11/10.11.8.236:2181. Will
>>>>> not attempt to authenticate using SASL (unknown error)
>>>>> 2016-07-11 15:49:17,314 INFO  org.apache.zookeeper.ClientCnxn:980 -
>>>>> Socket connection established, initiating session, client: /
>>>>> 10.11.12.4:47295, server: QA-E8WIN11/10.11.8.236:2181
>>>>> 2016-07-11 15:49:17,316 INFO  org.apache.zookeeper.ClientCnxn:1400 -
>>>>> Session establishment complete on server QA-E8WIN11/10.11.8.236:2181,
>>>>> sessionid = 0x100024f0936005f, negotiated timeout = 45000 (READ-ONLY
mode)
>>>>> 2016-07-11 15:49:17,316 INFO  org.apache.zookeeper.ClientCnxn:1297 -
>>>>> Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 400
>>>>> 2016-07-11 15:49:17,316 INFO
>>>>>  org.apache.curator.framework.state.ConnectionStateManager:228 - State
>>>>> change: READ_ONLY
>>>>> 2016-07-11 15:51:24,607 INFO  org.apache.zookeeper.ClientCnxn:1253 -
>>>>> Unable to read additional data from server sessionid 0x100024f0936005f,
>>>>> likely server has closed socket, closing socket connection and attempting
>>>>> reconnect
>>>>> 2016-07-11 15:51:24,713 INFO
>>>>>  org.apache.curator.framework.state.ConnectionStateManager:228 - State
>>>>> change: SUSPENDED
>>>>> 2016-07-11 15:51:26,467 INFO  org.apache.zookeeper.ClientCnxn:1138 -
>>>>> Opening socket connection to server QA-E8WIN11/10.11.8.236:2181. Will
>>>>> not attempt to authenticate using SASL (unknown error)
>>>>> 2016-07-11 15:51:26,468 INFO  org.apache.zookeeper.ClientCnxn:980 -
>>>>> Socket connection established, initiating session, client: /
>>>>> 10.11.12.4:47576, server: QA-E8WIN11/10.11.8.236:2181
>>>>> 2016-07-11 15:51:26,470 INFO  org.apache.zookeeper.ClientCnxn:1400 -
>>>>> Session establishment complete on server QA-E8WIN11/10.11.8.236:2181,
>>>>> sessionid = 0x100024f09360061, negotiated timeout = 45000 (READ-ONLY
mode)
>>>>> 2016-07-11 15:51:26,470 INFO  org.apache.zookeeper.ClientCnxn:1297 -
>>>>> Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 800
>>>>> 2016-07-11 15:51:26,470 INFO
>>>>>  org.apache.curator.framework.state.ConnectionStateManager:228 - State
>>>>> change: READ_ONLY
>>>>> (...)
>>>>>
>>>>> Thanks
>>>>> Ben
>>>>>
>>>>>
>>>>
>>>
>>
>

Mime
View raw message