hadoop-common-issues mailing list archives

From "Suresh Srinivas (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-7472) RPC client should deal with the IP address changes
Date Wed, 27 Jul 2011 21:05:09 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-7472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072006#comment-13072006
] 

Suresh Srinivas commented on HADOOP-7472:
-----------------------------------------

bq. We are working on a new approach, which will address both 1 and 2.
Can you add more details?

bq. They don't. It's that they create an InetSocketAddress and the lower layers have no way
of knowing what it was originally instantiated with. This is a headache when dealing with
tokens.
You could treat the host name from the InetSocketAddress as the destination identifier.
I am not sure what you mean by "originally instantiated with" and why treating InetSocketAddress
as a carrier of host name information will not work.

bq. This is a headache when dealing with tokens.
I am not sure what the headache is. I will ask Jitendra to comment on this from a security perspective.

bq. We thought about this. Darren is working on the token renewal problem and we found out
we can have a common solution. One way was to do what you mentioned. But decided to keep it
as is but use createUnresolved() to create an InetSocketAddress, so that we know what was
used to instantiate it. If the user slapped in an IP address to begin with, we won't handle
it. (I think it was indistinguishable before)

Does this mean you will have a different implementation later? Like I said, we could treat
InetSocketAddress as a carrier of the hostname and not attach any other semantics to it. This should
be fine because InetSocketAddress is what is passed, and the name (URL) is also resolved to an
InetSocketAddress.
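To make the "carrier of hostname" idea concrete, here is a minimal sketch. It is not code from the attached patches. The class name HostNameCarrier, the method destinationId, and the host nn.example.com are made up for illustration, and getHostString() assumes Java 7 or later.

```java
import java.net.InetSocketAddress;

final class HostNameCarrier {
    // Builds a "host:port" identifier from whatever the address was created with.
    // getHostString() returns the stored host name (or the IP literal if no name
    // was given) and does not trigger a reverse DNS lookup.
    static String destinationId(InetSocketAddress addr) {
        return addr.getHostString() + ":" + addr.getPort();
    }

    public static void main(String[] args) {
        // createUnresolved() keeps the name and performs no DNS lookup.
        InetSocketAddress byName = InetSocketAddress.createUnresolved("nn.example.com", 8020);
        System.out.println(byName.isUnresolved() + " " + destinationId(byName));

        // An address built from an IP literal carries only the literal.
        InetSocketAddress byIp = new InetSocketAddress("127.0.0.1", 8020);
        System.out.println(byIp.isUnresolved() + " " + destinationId(byIp));
    }
}
```

An address created from a name can be re-resolved later from that name; one created from an IP literal has nothing to re-resolve, which matches the "we won't handle it" case quoted above.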

bq. The token will have whatever the user used (IP or name) in the beginning and in case of
using name, the key to the token cache won't change even with addr changes. So the delegation
token should continue to work.
Jitendra, any comments on this? My thought is that if you wrap InetSocketAddress, an appropriate
key, such as the name or host name from the InetSocketAddress, could be used.
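A rough sketch of what keying by name could look like follows. TokenCacheSketch, its methods, and the String token value are hypothetical and are not Hadoop's token classes; the only point is that a key derived from host name and port stays stable when the IP address behind the name changes.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.HashMap;
import java.util.Map;

final class TokenCacheSketch {
    private final Map<String, String> tokens = new HashMap<>();

    // Key on host name and port, not on the resolved IP address.
    static String key(InetSocketAddress addr) {
        return addr.getHostString() + ":" + addr.getPort();
    }

    void put(InetSocketAddress addr, String token) {
        tokens.put(key(addr), token);
    }

    String get(InetSocketAddress addr) {
        return tokens.get(key(addr));
    }

    // Helper for the illustration: pairs a host name with a fixed IP without any DNS lookup.
    static InetSocketAddress named(String host, byte[] ip, int port) {
        try {
            return new InetSocketAddress(InetAddress.getByAddress(host, ip), port);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("bad IP length", e);
        }
    }
}
```

With this, a token stored under nn.example.com:8020 while the name pointed at 10.0.0.1 is still found after the name moves to 10.0.0.2, even though the two InetSocketAddress objects themselves are not equal.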



> RPC client should deal with the IP address changes
> --------------------------------------------------
>
>                 Key: HADOOP-7472
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7472
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ipc
>    Affects Versions: 0.20.205.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Minor
>             Fix For: 0.20.205.0
>
>         Attachments: addr_change_dfs-1.patch.txt, addr_change_dfs.patch.txt
>
>
> The current RPC client implementation and the client-side callers assume that the hostname-address
mappings of servers never change. The resolved address is stored in an immutable InetSocketAddress
object above/outside RPC, and the reconnect logic in the RPC Connection implementation also
trusts the resolved address that was passed down.
> If the NN suffers a failure that requires migration, it may be started on a different
node with a different IP address. In this case, even if the name-address mapping is updated
in DNS, the cluster is stuck trying the old address until the whole cluster is restarted.
> The RPC client-side should detect this situation and exit or try to recover.
> Updating ConnectionId within the Client implementation may get the system working for the
moment, but there is always a risk of the cached address:port becoming connectable again unintentionally.
The real solution will be notifying the upper layer of the address change so that it can re-resolve
and retry, or re-architecting the system as discussed in HDFS-34. 
> For the 0.20 line, some type of compromise may be acceptable. For example, raise a custom
exception for some well-defined high-impact upper layers to do re-resolve/retry, while others
will have to restart.  For TRUNK, the HA work will most likely determine what needs to be
done.  So this Jira won't cover the solutions for TRUNK.
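The "custom exception" compromise in the description could be sketched roughly as below. This is only an illustration under assumptions: AddressChangeSketch, AddressChangedException, Resolver, and checkAddress are invented names, not Hadoop classes or the contents of the attached patches. The resolver is pluggable so the check can be exercised without real DNS.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;

final class AddressChangeSketch {
    // Hypothetical exception an upper layer could catch to re-resolve and retry.
    static final class AddressChangedException extends IOException {
        final InetSocketAddress updated;

        AddressChangedException(InetSocketAddress cached, InetSocketAddress updated) {
            super("Address changed: " + cached + " -> " + updated);
            this.updated = updated;
        }
    }

    // Pluggable lookup; a real client might pass InetAddress::getByName.
    interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    // Re-resolves the cached host name and reports a change instead of silently
    // reconnecting to the stale address.
    static void checkAddress(InetSocketAddress cached, Resolver resolver) throws IOException {
        InetAddress fresh = resolver.resolve(cached.getHostString());
        if (!fresh.equals(cached.getAddress())) {
            throw new AddressChangedException(cached, new InetSocketAddress(fresh, cached.getPort()));
        }
    }

    // Helper for the illustration: a named address with a fixed IP, no DNS lookup.
    static InetAddress fixed(String host, byte[] ip) {
        try {
            return InetAddress.getByAddress(host, ip);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("bad IP length", e);
        }
    }
}
```

A caller that can recover would catch AddressChangedException and reconnect using the updated field; callers that cannot would let it propagate and restart, as the description suggests.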

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
