curator-user mailing list archives

From Andy Grove <andy.gr...@codefutures.com>
Subject Curator client fails to connect if any one of my zookeeper instances is down
Date Tue, 25 Jun 2013 17:16:09 GMT
Hi,

I'm using the following code to connect to my zookeeper instances:

    client = CuratorFrameworkFactory.newClient(connectString, sessionTimeout, connectTimeout,
            new ExponentialBackoffRetry(1000, 3));

I have three hosts, let's call them host1, host2 and host3. If all hosts are running then everything
works as expected.

If host1 is down (server shut down) then all operations on the curator client fail and I see
errors like this:

ERROR com.netflix.curator.ConnectionState - Connection timed out for connection string (host1:8090,host2:8090,host3:8090) and timeout (15000) / elapsed (15310)
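
To rule out a plain network problem, each entry in the connect string can be checked for TCP reachability independently of Curator. The following is a minimal sketch using only the JDK; the class name ConnectStringProbe, its methods, and the 2000 ms probe timeout are hypothetical choices for illustration, not part of Curator or ZooKeeper. It assumes every entry has the form host:port and does not handle a chroot suffix.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of Curator or ZooKeeper.
final class ConnectStringProbe {

    /** Splits "host1:8090,host2:8090" into {host, port} pairs. */
    static List<String[]> parse(String connectString) {
        List<String[]> servers = new ArrayList<>();
        for (String entry : connectString.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("missing port: " + trimmed);
            }
            servers.add(new String[] {
                trimmed.substring(0, colon), trimmed.substring(colon + 1)
            });
        }
        return servers;
    }

    /** Returns true if a TCP connection to host:port opens within timeoutMs. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolvable host all count as unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        String connectString =
            args.length > 0 ? args[0] : "host1:8090,host2:8090,host3:8090";
        for (String[] server : parse(connectString)) {
            boolean up = isReachable(server[0], Integer.parseInt(server[1]), 2000);
            System.out.println(server[0] + ":" + server[1] + " -> "
                + (up ? "reachable" : "unreachable"));
        }
    }
}
```

An open TCP port only shows that something is listening there; it does not prove that the ZooKeeper server behind it is healthy or part of a quorum.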

It doesn't matter what order I specify the hosts in; I always get these errors, and my operation
eventually fails with:

     [java] Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss
     [java] 	at com.netflix.curator.ConnectionState.getZooKeeper(ConnectionState.java:101)
     [java] 	at com.netflix.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:107)
     [java] 	at com.netflix.curator.framework.imps.CuratorFrameworkImpl.getZooKeeper(CuratorFrameworkImpl.java:445)
     [java] 	at com.netflix.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:171)
     [java] 	at com.netflix.curator.framework.imps.ExistsBuilderImpl$2.call(ExistsBuilderImpl.java:160)
     [java] 	at com.netflix.curator.RetryLoop.callWithRetry(RetryLoop.java:106)
     [java] 	at com.netflix.curator.framework.imps.ExistsBuilderImpl.pathInForeground(ExistsBuilderImpl.java:156)
     [java] 	at com.netflix.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:147)
     [java] 	at com.netflix.curator.framework.imps.ExistsBuilderImpl.forPath(ExistsBuilderImpl.java:35)
     [java] 	at com.dbshards.nameserver.ZKClient.createPath(ZKClient.java:406)

I would expect Curator/Zookeeper to try this operation with host2 or host3 after an error
connecting to host1 but this is not the case. I even have a retry loop in my code that tries
the operation 10 times and it fails every time if host1 is in the connect string.
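
For readers following along, an application-level retry loop of the kind described above might look roughly like this. It is a minimal sketch, not the actual ZKClient code: the class RetrySketch, the method callWithRetries, and the fixed pause between attempts are all hypothetical.

```java
import java.util.concurrent.Callable;

// Hypothetical helper; not part of Curator and not the original ZKClient code.
final class RetrySketch {

    /**
     * Calls op up to maxAttempts times, sleeping pauseMs after each failure
     * except the last. Returns the first successful result, or rethrows the
     * last exception once every attempt has failed.
     */
    static <T> T callWithRetries(Callable<T> op, int maxAttempts, long pauseMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (InterruptedException e) {
                // Do not retry when interrupted; restore the flag and give up.
                Thread.currentThread().interrupt();
                throw e;
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMs);
                }
            }
        }
        throw last;
    }
}
```

In this scenario the operation passed in would be something like `() -> client.checkExists().forPath(path)`, with `maxAttempts` set to 10. A loop like this wraps whatever the client does internally, so if every attempt goes through the same failing connection it will fail all 10 times, which matches the behaviour reported here.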

I'm hoping I'm missing something obvious here. Any help would be appreciated.

Thanks,

Andy.

--
Andy Grove
VP, R&D
CodeFutures Corporation

Share Nothing, Shard Everything!
http://www.dbshards.com




