hadoop-common-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12125) Retrying UnknownHostException on a proxy does not actually retry hostname resolution
Date Fri, 26 Jun 2015 21:58:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603668#comment-14603668 ]

Jason Lowe commented on HADOOP-12125:

This is very similar to HDFS-8068, but that issue tries to work around the problem at the
application layer when it sets up the proxy.  Ideally this should be handled as much as possible
in the IPC layer itself so we can treat UnknownHostException like other retriable exceptions.

One possible approach is to have the Connection constructor call the updateAddress
method if the ConnectionId socket address is unresolved.  Then we would actually try to re-resolve
the address.  One downside to this approach is that we could end up with multiple clients for the
same server, since the ConnectionId socket address is used as part of the hashcode.  However,
this seems better than either retrying forever for no benefit or requiring app-level code
to retry on its own when setting up the proxy.
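
To illustrate the idea, here is a minimal sketch in plain java.net terms.  It is not the
Hadoop Client code: ReResolveSketch, Resolver and reResolveIfNeeded are hypothetical names,
and the Resolver hook exists only so the example runs without real DNS.  It shows both the
re-resolution step and the downside mentioned above, namely that the re-resolved address is
no longer equal to the unresolved one, so anything keyed on it sees a different key.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

final class ReResolveSketch {
    /** Hypothetical lookup hook; a real client would go through normal name resolution. */
    interface Resolver {
        InetAddress lookup(String host) throws UnknownHostException;
    }

    /**
     * Returns a freshly resolved address if addr is unresolved and the lookup
     * now succeeds; otherwise returns addr unchanged.
     */
    static InetSocketAddress reResolveIfNeeded(InetSocketAddress addr, Resolver resolver) {
        if (!addr.isUnresolved()) {
            return addr; // nothing to do, the address already carries an IP
        }
        try {
            InetAddress ip = resolver.lookup(addr.getHostString());
            return new InetSocketAddress(ip, addr.getPort());
        } catch (UnknownHostException e) {
            return addr; // still unresolvable; the retry policy decides what happens next
        }
    }

    public static void main(String[] args) {
        // createUnresolved stands in for an address whose earlier lookup failed.
        InetSocketAddress stale =
            InetSocketAddress.createUnresolved("namenode.example.invalid", 8020);

        // Fake resolver that "finds" the host now, without touching DNS.
        Resolver fake = host -> InetAddress.getByAddress(host, new byte[] {10, 0, 0, 1});

        InetSocketAddress fresh = reResolveIfNeeded(stale, fake);
        System.out.println("resolved now: " + !fresh.isUnresolved());
        // The downside: the two addresses are different keys for the same server.
        System.out.println("equal to stale address: " + stale.equals(fresh));
    }
}
```

Because an unresolved InetSocketAddress never equals a resolved one, a cache keyed on the
address would treat the two as separate entries, which is the "multiple clients for the same
server" trade-off.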

> Retrying UnknownHostException on a proxy does not actually retry hostname resolution
> ------------------------------------------------------------------------------------
>                 Key: HADOOP-12125
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12125
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>            Reporter: Jason Lowe
> When RetryInvocationHandler attempts to retry an UnknownHostException the hostname fails
> to be resolved again.  The InetSocketAddress in the ConnectionId has cached the fact that
> the hostname is unresolvable, and when the proxy tries to set up a new Connection object with
> that ConnectionId it checks if the (cached) resolution result is unresolved and immediately
> throws.
> The end result is we sleep and retry for no benefit.  The hostname resolution is never
> attempted again.
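
The behaviour described in the issue can be reproduced with java.net alone.  The sketch
below is an assumption-laden illustration, not Hadoop code: CachedUnresolvedSketch,
checkResolved and countFailures are hypothetical names, and createUnresolved stands in for
an InetSocketAddress whose lookup failed at construction time.  The point is that
isUnresolved() only reads the stored state, so retrying against the same object can never
succeed.

```java
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

final class CachedUnresolvedSketch {
    /** Mimics the described check: fail fast on the cached unresolved flag. */
    static void checkResolved(InetSocketAddress addr) throws UnknownHostException {
        if (addr.isUnresolved()) {
            throw new UnknownHostException("unresolved: " + addr.getHostString());
        }
    }

    /** Retries the check against the same address object and counts the failures. */
    static int countFailures(InetSocketAddress addr, int attempts) {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            try {
                checkResolved(addr);
            } catch (UnknownHostException e) {
                failures++; // no lookup was attempted; the stored state was simply re-read
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        InetSocketAddress addr =
            InetSocketAddress.createUnresolved("namenode.example.invalid", 8020);
        System.out.println("failures in 3 attempts: " + countFailures(addr, 3));
    }
}
```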
