curator-dev mailing list archives

From "Benjamin Jaton (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CURATOR-355) Curator client fails when connecting to read-only ensemble
Date Mon, 10 Oct 2016 22:17:20 GMT

    [ https://issues.apache.org/jira/browse/CURATOR-355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15563713#comment-15563713
] 

Benjamin Jaton commented on CURATOR-355:
----------------------------------------

When I connect using the ZK API directly with sessionTimeout=45000 and it picks the
server that is NOT started first, the ZK client API takes 22 seconds (45/2?) to try the
second server, which then works, and I get my connection.

In contrast, Curator seems to wait only connectionTimeout=15000 in blockUntilConnectedOrTimedOut(),
so it seems to fail because it stops trying too early.
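
The arithmetic behind this suspicion can be sketched in plain Java. This is only an illustration, not Curator or ZooKeeper code: the class and method names are hypothetical, and the "half the session timeout" failover delay is the guess from the comment above (45/2), not a confirmed rule.

```java
public class TimeoutCheck {
    // ASSUMPTION (from the "45/2?" guess above): the ZK client waits about
    // half the session timeout before it moves on to the second server.
    static long assumedFailoverDelayMs(long sessionTimeoutMs) {
        return sessionTimeoutMs / 2;
    }

    // True when a wait bounded by connectionTimeoutMs would end before the
    // assumed failover to the second server has had a chance to happen.
    static boolean givesUpBeforeFailover(long sessionTimeoutMs, long connectionTimeoutMs) {
        return connectionTimeoutMs < assumedFailoverDelayMs(sessionTimeoutMs);
    }

    public static void main(String[] args) {
        long sessionTimeoutMs = 45000;    // value used in the report
        long connectionTimeoutMs = 15000; // value used in the report

        System.out.println("assumed failover delay (ms): "
                + assumedFailoverDelayMs(sessionTimeoutMs));
        System.out.println("gives up before failover: "
                + givesUpBeforeFailover(sessionTimeoutMs, connectionTimeoutMs));
    }
}
```

If the guess holds, a connection timeout above roughly half the session timeout would be needed for the blocking wait to outlast the first failed connection attempt.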

> Curator client fails when connecting to read-only ensemble
> ----------------------------------------------------------
>
>                 Key: CURATOR-355
>                 URL: https://issues.apache.org/jira/browse/CURATOR-355
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 2.11.0
>            Reporter: Benjamin Jaton
>            Priority: Critical
>         Attachments: test2.log
>
>
> ZK is 3.5.1-alpha
> I have a 3-node ZK cluster; read-only mode is enabled.
> 2 nodes are down, so the remaining one (QA-E8WIN11) is in read-only mode (verified by
> using the ZK API manually). All the machines of the ensemble can be pinged from the client.
> I'm using this piece of code:
> {code}
> 		Builder curatorClientBuilder = CuratorFrameworkFactory.builder()
> 				.connectString("QA-E8WIN11:2181,QA-E8WIN12:2181")
> 				.sessionTimeoutMs(45000).connectionTimeoutMs(15000)
> 				.retryPolicy(new RetryNTimes(3, 5000)).canBeReadOnly(true);
> 		CuratorFramework client = curatorClientBuilder.build();
> 		client.start();
> 		client.getZookeeperClient().blockUntilConnectedOrTimedOut();
> 		System.out.println("Successfully established the connection with ZooKeeper");
> 		
> 		client.getData().forPath("/");
> 		System.out.println("Done.");{code}
> When Curator picks the host that is up first, it goes through very quickly. When it picks
> the host that is down first (QA-E8WIN12), it seems to be stuck at the getData() call for a
> very long time and then eventually fails with a ConnectionLossException (see attached log).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
