hadoop-common-issues mailing list archives

From "Kihwal Lee (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HADOOP-7472) RPC client should deal with the IP address changes
Date Mon, 18 Jul 2011 18:57:57 GMT
RPC client should deal with the IP address changes

                 Key: HADOOP-7472
                 URL: https://issues.apache.org/jira/browse/HADOOP-7472
             Project: Hadoop Common
          Issue Type: Improvement
          Components: ipc
    Affects Versions:
            Reporter: Kihwal Lee
            Assignee: Kihwal Lee
            Priority: Minor

The current RPC client implementation and the client-side callers assume that the hostname-address
mappings of servers never change. The resolved address is stored in an immutable InetSocketAddress
object above/outside RPC, and the reconnect logic in the RPC Connection implementation also
trusts the resolved address that was passed down.
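
To illustrate the caching described above, here is a minimal sketch in plain Java. It is not
Hadoop code: the class name CachedAddressDemo, the host name, and the IP are made up for
illustration. It shows that a resolved InetSocketAddress keeps the IP it was created with,
while an unresolved one holds only the name.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class CachedAddressDemo {
    // Mimics an upper layer that resolved the server once and keeps the result.
    // InetAddress.getByAddress performs no DNS lookup, so the demo is deterministic.
    public static InetSocketAddress resolvedOnce() throws UnknownHostException {
        InetAddress ip = InetAddress.getByAddress("nn.example.com", new byte[] {10, 0, 0, 1});
        return new InetSocketAddress(ip, 8020);
    }

    public static void main(String[] args) throws Exception {
        InetSocketAddress cached = resolvedOnce();
        // The object has no setter or refresh method: it keeps reporting this IP
        // regardless of what DNS says later.
        System.out.println(cached.getHostName() + " -> " + cached.getAddress().getHostAddress());

        // By contrast, an unresolved address holds only the name and defers the lookup.
        InetSocketAddress deferred = InetSocketAddress.createUnresolved("nn.example.com", 8020);
        System.out.println(deferred.isUnresolved() + " " + deferred.getAddress());
    }
}
```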

If the NameNode (NN) suffers a failure that requires migration, it may be restarted on a different
node with a different IP address. In this case, even if the name-address mapping is updated in
DNS, clients keep trying the old address until the whole cluster is restarted.

The RPC client side should detect this situation and either exit or try to recover.
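
One possible way to detect the situation is to re-resolve the host name and compare the result
with the cached address. The sketch below is an assumption about how such a check could look,
not the approach taken in any Hadoop patch; AddressChangeCheck, reResolve, addressChanged, and
of are hypothetical names, and the addresses are illustrative.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

public class AddressChangeCheck {
    // Re-resolves the host name held by a cached address (this performs a DNS lookup).
    public static InetSocketAddress reResolve(InetSocketAddress cached) {
        return new InetSocketAddress(cached.getHostName(), cached.getPort());
    }

    // True when the fresh lookup succeeded and yields a different IP than the cached one.
    public static boolean addressChanged(InetSocketAddress cached, InetSocketAddress fresh) {
        if (fresh.isUnresolved()) {
            return false; // lookup failed, so nothing can be concluded
        }
        return !fresh.getAddress().equals(cached.getAddress());
    }

    // Demo helper: builds a resolved address without touching DNS.
    public static InetSocketAddress of(String host, int a, int b, int c, int d, int port)
            throws UnknownHostException {
        byte[] ip = {(byte) a, (byte) b, (byte) c, (byte) d};
        return new InetSocketAddress(InetAddress.getByAddress(host, ip), port);
    }

    public static void main(String[] args) throws Exception {
        InetSocketAddress cached = of("nn.example.com", 10, 0, 0, 1, 8020);
        // Stands in for what reResolve(cached) would return after the NN has migrated.
        InetSocketAddress afterMigration = of("nn.example.com", 10, 0, 0, 2, 8020);
        System.out.println(addressChanged(cached, cached));
        System.out.println(addressChanged(cached, afterMigration));
    }
}
```

A failed lookup is treated as "no change" here so that a DNS outage alone does not trigger
recovery; a real client might prefer a different policy.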

Updating ConnectionId within the Client implementation may get the system working for the moment,
but there is always a risk that the cached address:port becomes connectable again unintentionally.
The real solution is to notify the upper layers of the address change so that they can re-resolve
and retry, or to re-architect the system as discussed in HDFS-34.

For the 0.20 line, some type of compromise may be acceptable. For example, the client could raise
a custom exception so that some well-defined, high-impact upper layers can re-resolve and retry,
while the others will have to restart. For TRUNK, the HA work will most likely determine what
needs to be done, so this Jira won't cover solutions for TRUNK.
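
The custom-exception compromise could be sketched as follows. This is a hypothetical
illustration, not Hadoop's API: AddressChangedException, Call, and callWithRetry are invented
names, and the RPC layer is faked by a lambda that rejects the stale address.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;

public class AddressChangeRetry {
    // Hypothetical exception raised by the RPC layer when it sees a new address.
    public static class AddressChangedException extends IOException {
        private final InetSocketAddress current;

        public AddressChangedException(InetSocketAddress stale, InetSocketAddress current) {
            super("Server address changed from " + stale + " to " + current);
            this.current = current;
        }

        public InetSocketAddress getCurrentAddress() {
            return current;
        }
    }

    // Stand-in for one RPC invocation against a given address.
    public interface Call<T> {
        T invoke(InetSocketAddress addr) throws IOException;
    }

    // Upper-layer wrapper: on an address change, switch to the new address and retry once.
    public static <T> T callWithRetry(InetSocketAddress addr, Call<T> call) throws IOException {
        try {
            return call.invoke(addr);
        } catch (AddressChangedException e) {
            return call.invoke(e.getCurrentAddress());
        }
    }

    public static void main(String[] args) throws Exception {
        InetSocketAddress oldAddr = new InetSocketAddress(
                InetAddress.getByAddress("nn.example.com", new byte[] {10, 0, 0, 1}), 8020);
        InetSocketAddress newAddr = new InetSocketAddress(
                InetAddress.getByAddress("nn.example.com", new byte[] {10, 0, 0, 2}), 8020);

        // Fake RPC layer: rejects the stale address, answers on the new one.
        String result = callWithRetry(oldAddr, addr -> {
            if (addr.equals(oldAddr)) {
                throw new AddressChangedException(oldAddr, newAddr);
            }
            return "connected to " + addr.getAddress().getHostAddress();
        });
        System.out.println(result);
    }
}
```

Callers that do not wrap their calls this way would see the exception as an ordinary
IOException and fail, which corresponds to the "others will have to restart" case.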

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira