curator-dev mailing list archives

From "Michael Putters (JIRA)" <>
Subject [jira] [Commented] (CURATOR-229) No retry on DNS lookup failure
Date Thu, 02 Jul 2015 11:49:04 GMT


Michael Putters commented on CURATOR-229:

The stacktrace:

ERROR o.a.c.f.imps.CuratorFrameworkImpl - Background exception was not retry-able or retry gave up
  at ~[na:1.7.0_67]
  at ~[na:1.7.0_67]
  at ~[na:1.7.0_67]
  at org.apache.zookeeper.client.StaticHostProvider.<init>(
  at org.apache.zookeeper.ZooKeeper.<init>( ~[zookeeper-3.4.6.jar:3.4.6-1569965]
  at org.apache.curator.utils.DefaultZookeeperFactory.newZooKeeper(
  at org.apache.curator.framework.imps.CuratorFrameworkImpl$2.newZooKeeper(
  at org.apache.curator.HandleHolder$1.getZooKeeper( ~[curator-client-2.7.0.jar:na]
  at org.apache.curator.HandleHolder.getZooKeeper( ~[curator-client-2.7.0.jar:na]
  at org.apache.curator.ConnectionState.reset( ~[curator-client-2.7.0.jar:na]
  at org.apache.curator.ConnectionState.checkTimeouts( ~[curator-client-2.7.0.jar:na]
  at org.apache.curator.ConnectionState.getZooKeeper( ~[curator-client-2.7.0.jar:na]
  at org.apache.curator.CuratorZookeeperClient.getZooKeeper(
  at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(
  at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(
  at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$400(
  at org.apache.curator.framework.imps.CuratorFrameworkImpl$
  at [na:1.7.0_67]
  at java.util.concurrent.ThreadPoolExecutor.runWorker( [na:1.7.0_67]
  at java.util.concurrent.ThreadPoolExecutor$ [na:1.7.0_67]
  at [na:1.7.0_67]

> No retry on DNS lookup failure
> ------------------------------
>                 Key: CURATOR-229
>                 URL:
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Framework
>    Affects Versions: 2.7.0
>            Reporter: Michael Putters
> Our environment is set up so that host names (rather than IP addresses) are used when
> registering services.
> When a node is disconnected from the network, it attempts to reconnect and, in order
> to do so, tries to resolve a host name. The lookup fails, since we have no network
> connectivity and a DNS server is used.
> It appears this type of exception is not retryable: the node simply gives up and never
> reconnects, even once network connectivity is back.
> Is this the expected behavior? Is there any way to configure Curator so that this type
> of exception is retryable? I had a look at {{}} around line 768 but
> there doesn't seem to be anything configurable.
> If this is not the expected behavior (or if it is but you don't mind making it configurable),
> I should be able to provide a patch via a pull request.
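
For illustration only, here is a minimal sketch of the behavior being asked for: treating a failed host name lookup as a retryable condition rather than a fatal one. This is not Curator's code or API. The class DnsRetrySketch, the Resolver interface, and the maxAttempts/sleepMs parameters are all hypothetical names made up for this example, and a fixed sleep between attempts is an assumption, not something taken from the issue.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical sketch: NOT part of Curator or ZooKeeper.
final class DnsRetrySketch {

    /** Hypothetical hook so the lookup can be replaced (e.g. in tests). */
    interface Resolver {
        InetAddress[] resolve(String host) throws UnknownHostException;
    }

    /** Default resolver backed by the JDK's own lookup. */
    static final Resolver SYSTEM = InetAddress::getAllByName;

    /**
     * Tries to resolve the host up to maxAttempts times, sleeping sleepMs
     * between failed attempts. Rethrows the last UnknownHostException if
     * every attempt fails.
     */
    static InetAddress[] resolveWithRetry(Resolver resolver, String host,
                                          int maxAttempts, long sleepMs)
            throws UnknownHostException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        UnknownHostException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return resolver.resolve(host);
            } catch (UnknownHostException e) {
                // The lookup failed; treat it as transient and try again.
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(sleepMs);
                }
            }
        }
        throw last;
    }
}
```

A caller would use something like DnsRetrySketch.resolveWithRetry(DnsRetrySketch.SYSTEM, "some-host", 5, 1000L). Whether such a loop belongs in Curator's retry policy or closer to where the ZooKeeper handle is created is exactly the open question in this issue.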

This message was sent by Atlassian JIRA
