hadoop-common-issues mailing list archives

From "Andy Isaacson (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-6867) Using socket address for datanode registry breaks multihoming
Date Thu, 11 Oct 2012 20:57:03 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andy Isaacson updated HADOOP-6867:
----------------------------------

    Description: 
Related: 
* https://issues.apache.org/jira/browse/HADOOP-985
* https://issues.apache.org/jira/secure/attachment/12350813/HADOOP-985-1.patch
* http://old.nabble.com/public-IP-for-datanode-on-EC2-td19336240.html
* http://www.cloudera.com/blog/2008/12/securing-a-hadoop-cluster-through-a-gateway/ 

Datanodes register using their DNS name (which is even configurable via dfs.datanode.dns.interface).
However, the Namenode effectively uses only the source address that the registration came from
when handing datanode locations to clients that want to write to HDFS.
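
For illustration, here is a rough, JDK-only sketch of how a datanode could derive its registration
name from one configured interface, roughly what dfs.datanode.dns.interface selects; the interface
name "eth1" is just an assumption for this example:

{code:java}
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.util.Collections;

// Hypothetical sketch: derive a registration hostname from one interface,
// approximating what dfs.datanode.dns.interface selects. "eth1" is assumed.
public class RegistrationName {
    public static void main(String[] args) throws Exception {
        NetworkInterface nic = NetworkInterface.getByName("eth1");
        if (nic == null) {
            System.err.println("no such interface: eth1");
            return;
        }
        for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
            // The name the datanode would like to register under...
            System.out.println("register as: " + addr.getCanonicalHostName());
            // ...but the namenode only remembers the source address of the
            // registration connection and hands that address to clients.
        }
    }
}
{code}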

Specific environment that causes this problem:
* Datanode and Namenode are multihomed on two networks.
* The Datanode registers with the Namenode using its DNS name on network #1.
* A client (distcp) connects to the Namenode on network #2 \(*) and is told to write to datanodes
on network #1, which doesn't work for us.

\(*) Allowing contact to the namenode on multiple networks was achieved with a socat proxy
hack that tunnels network #2 to port 8020 on network #1. This is unrelated to the issue at hand.
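
For completeness, a minimal Java sketch of what that proxy hack does, i.e. forward a port exposed
on network #2 to the namenode's port 8020 on network #1; the hostname below is a placeholder:

{code:java}
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch of the socat-style tunnel: listen on 8020 on network #2
// and forward every connection to the namenode's 8020 on network #1.
public class PortForward {
    public static void main(String[] args) throws Exception {
        String namenodeNet1 = "namenode-net1.example.com"; // placeholder name on network #1
        try (ServerSocket server = new ServerSocket(8020)) {
            while (true) {
                Socket client = server.accept();
                Socket upstream = new Socket(namenodeNet1, 8020);
                pump(client, upstream);
                pump(upstream, client);
            }
        }
    }

    // Copy bytes in one direction on a background thread.
    private static void pump(Socket from, Socket to) {
        new Thread(() -> {
            try (InputStream in = from.getInputStream();
                 OutputStream out = to.getOutputStream()) {
                byte[] buf = new byte[8192];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                    out.flush();
                }
            } catch (Exception ignored) {
                // one side closed; let the connection drop
            }
        }).start();
    }
}
{code}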


The Cloudera link above recommends proxying for reasons other than multihoming. Proxying would
work here too, but it doesn't sound like it would scale well (bandwidth, multiplicity, multi-tenancy, etc.).

Our specific scenario is wanting to distcp over a different network interface than the one the
datanodes register themselves on, but it would be nice if both (all) interfaces worked. We are
internally going to patch hadoop to roll back parts of the patch mentioned above so that we rely
on the datanode name rather than the socket address it uses to talk to the namenode. The
alternative is to push config changes to all nodes that force them to listen/register on one
specific interface only. This works around our specific problem, but doesn't really help with
multihoming.
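
As a sketch of that "pin to one interface" option (assuming the standard Configuration API and
0.20-era key names; the interface name and address are placeholders):

{code:java}
import org.apache.hadoop.conf.Configuration;

// Hypothetical sketch of pinning a datanode to a single interface via config.
// Key names assumed from 0.20-era defaults; values are placeholders.
public class PinToOneInterface {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.set("dfs.datanode.dns.interface", "eth1");       // register using eth1's DNS name
        conf.set("dfs.datanode.address", "10.1.0.15:50010");  // bind the data port to eth1's address
        System.out.println("registering via " + conf.get("dfs.datanode.dns.interface"));
    }
}
{code}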

I would propose that datanodes register all of their interface addresses during registration (or
heartbeat, or whatever process handles this), and that HDFS clients be given all addresses for a
given node to perform operations against, selecting accordingly (or just "whichever works first"),
much like round-robin DNS.
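
A rough client-side sketch of that idea, using only the JDK; the datanode name and port below are
placeholders:

{code:java}
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical sketch: try every address advertised for a datanode and use
// whichever connects first, similar to round-robin DNS behavior.
public class PickReachableAddress {
    static InetSocketAddress firstReachable(String host, int port, int timeoutMs) throws Exception {
        for (InetAddress addr : InetAddress.getAllByName(host)) {
            InetSocketAddress target = new InetSocketAddress(addr, port);
            try (Socket s = new Socket()) {
                s.connect(target, timeoutMs);
                return target;            // reachable from this client's network
            } catch (Exception e) {
                // unreachable from here; try the next address
            }
        }
        throw new Exception("no reachable address for " + host + ":" + port);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstReachable("datanode1.example.com", 50010, 2000));
    }
}
{code}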


> Using socket address for datanode registry breaks multihoming
> -------------------------------------------------------------
>
>                 Key: HADOOP-6867
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6867
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 0.20.2
>         Environment: hadoop-0.20-0.20.2+228-1, centos 5, distcp
>            Reporter: Jordan Sissel

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
