curator-dev mailing list archives

From "Andy Grove (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CURATOR-40) Curator client cannot connect after one zookeeper host shuts down on EC2
Date Mon, 22 Jul 2013 20:20:48 GMT

    [ https://issues.apache.org/jira/browse/CURATOR-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13715596#comment-13715596 ]

Andy Grove commented on CURATOR-40:
-----------------------------------

I think you are correct. Here is the stack trace:

 java.net.UnknownHostException: this.host.does.not.exist2181: nodename nor servname provided, or not known
	at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
	at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:894)
	at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1286)
	at java.net.InetAddress.getAllByName0(InetAddress.java:1239)
	at java.net.InetAddress.getAllByName(InetAddress.java:1155)
	at java.net.InetAddress.getAllByName(InetAddress.java:1091)
	at org.apache.zookeeper.client.StaticHostProvider.<init>(StaticHostProvider.java:60)
	at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:445)
	at com.netflix.curator.utils.DefaultZookeeperFactory.newZooKeeper(DefaultZookeeperFactory.java:27)
	at com.netflix.curator.framework.imps.CuratorFrameworkImpl$2.newZooKeeper(CuratorFrameworkImpl.java:166)
	at com.netflix.curator.HandleHolder$1.getZooKeeper(HandleHolder.java:94)
	at com.netflix.curator.HandleHolder.getZooKeeper(HandleHolder.java:55)
	at com.netflix.curator.ConnectionState.reset(ConnectionState.java:210)
	at com.netflix.curator.ConnectionState.start(ConnectionState.java:124)
	at com.netflix.curator.CuratorZookeeperClient.start(CuratorZookeeperClient.java:182)
	at com.netflix.curator.framework.imps.CuratorFrameworkImpl.start(CuratorFrameworkImpl.java:231)

The workaround we have implemented, which isn't ideal, is to resolve every host name via DNS
ourselves and then rewrite the connect string using IP addresses before passing it to
Curator. We exclude any hosts that we cannot resolve via DNS.
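
For illustration, the workaround described above could look roughly like the sketch below. This
is not the code we actually run; the class name ConnectStringResolver and the method name
resolve are made up for this example. It assumes a plain "host:port,host:port" connect string
with no chroot suffix and no IPv6 literals, and it only uses java.net.InetAddress from the JDK.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Curator or ZooKeeper).
final class ConnectStringResolver {

    private ConnectStringResolver() {
    }

    /**
     * Rewrites a connect string such as "zk1.example.com:2181,zk2.example.com:2181"
     * so that each host name is replaced by its resolved IP address.
     * Hosts that fail DNS resolution are dropped instead of failing the whole call.
     * Assumes no chroot suffix and no IPv6 literal addresses.
     */
    static String resolve(String connectString) {
        List<String> resolved = new ArrayList<>();
        for (String entry : connectString.split(",")) {
            String hostPort = entry.trim();
            if (hostPort.isEmpty()) {
                continue;
            }
            // Split "host:port" on the last colon; the port part is optional.
            int colon = hostPort.lastIndexOf(':');
            String host = colon >= 0 ? hostPort.substring(0, colon) : hostPort;
            String portSuffix = colon >= 0 ? hostPort.substring(colon) : "";
            try {
                String ip = InetAddress.getByName(host).getHostAddress();
                resolved.add(ip + portSuffix);
            } catch (UnknownHostException e) {
                // Unresolvable host: leave it out of the rewritten connect string.
            }
        }
        return String.join(",", resolved);
    }
}
```

The returned string is then handed to Curator in place of the original connect string. A caller
should probably treat an empty result as an error, since it means none of the hosts could be
resolved. Note that the resolved addresses are a snapshot, so a host whose IP changes later will
not be picked up until the string is rebuilt.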

                
> Curator client cannot connect after one zookeeper host shuts down on EC2
> ------------------------------------------------------------------------
>
>                 Key: CURATOR-40
>                 URL: https://issues.apache.org/jira/browse/CURATOR-40
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Client
>    Affects Versions: 2.0.1-incubating
>         Environment: Ubuntu instances on Amazon EC2 using DNS host names
>            Reporter: Andy Grove
>         Attachments: ApacheCuratorUnknownHostTest.java
>
>
> We use DNS names on Amazon EC2 to specify Zookeeper host names. If one of the ZK hosts
> shuts down or loses network connectivity we can no longer connect via Curator, even though
> the other ZK hosts are still running and have quorum. The issue is specific to an
> UnknownHostException being thrown on DNS resolution when calling the start() method on
> CuratorZookeeperClient. The workaround is for us to use IP addresses rather than DNS names,
> but this isn't really workable on EC2 since IP addresses change when servers restart so we
> use Elastic IPs to ensure that the ZK hosts have fixed IP addresses.
> I have attached a unit test which demonstrates the issue and provides more detail in
> the comments.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
