hadoop-common-issues mailing list archives

From "Eli Collins (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8198) Support multiple network interfaces
Date Mon, 02 Apr 2012 15:55:24 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13244278#comment-13244278 ]

Eli Collins commented on HADOOP-8198:
-------------------------------------

This proposal enables host-level bonding (as part of use case #1). Today host-level bonding
doesn't work because the NN sets the DatanodeID IP field to the IP the IPC came from. So if
you configure the IP of a bond in dfs.datanode.address, it is ignored (i.e. this config is
only used to set the transfer port). HDFS-3146 will allow you to just report the IP of the
bond, thus enabling host-level bonding. Or, if you don't want to do host-level bonding, you
can just report the IPs of both interfaces. The latter is useful because some users find
host-level bonding a pain to configure and would prefer we use multiple interfaces out of
the box.
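
To make the difference concrete, here is a minimal sketch of the two registration behaviors
described above. It is not Hadoop code: the Registration class and both functions are
hypothetical names invented for illustration, and the fallback to the IPC source address in
the second function is an assumption, not something the proposal states.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Registration:
    """Simplified, hypothetical view of what a DN sends when it registers."""
    reported_ips: List[str]  # addresses the DN says it can be reached on
    transfer_port: int       # port taken from dfs.datanode.address


def register_today(reg: Registration, ipc_source_ip: str) -> List[str]:
    # Behavior described in the comment: the NN ignores the configured IP
    # and records the address the registration IPC arrived from.
    return [f"{ipc_source_ip}:{reg.transfer_port}"]


def register_proposed(reg: Registration, ipc_source_ip: str) -> List[str]:
    # Sketch of the proposed behavior: the NN records whatever the DN reports.
    # Falling back to the IPC source address is an assumption of this sketch.
    ips = reg.reported_ips or [ipc_source_ip]
    return [f"{ip}:{reg.transfer_port}" for ip in ips]


# A DN that reports its bond address but registers over a slave interface.
bonded = Registration(reported_ips=["10.0.0.100"], transfer_port=50010)
print(register_today(bonded, "10.0.0.11"))     # ['10.0.0.11:50010']
print(register_proposed(bonded, "10.0.0.11"))  # ['10.0.0.100:50010']
```

With two entries in reported_ips, the same sketch covers the no-bonding case where the DN
reports both interfaces.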

Note that host-level bonding is insufficient for use case #2. Suppose a host has 2 interfaces:
one is cluster-private (not routable by clients outside the cluster) and the other is usable
by clients outside the cluster (e.g. an adjacent cluster or system). You can't bond these two
interfaces, and the NN only advertises one DN IP, so it can only hand out one, which means
only one kind of client (on-cluster or off-cluster) will work. You can try to work around this
by port-forwarding from the public interface to the private interface, but that defeats the
purpose. Alternatively, if the DN was advertised by hostname, you can get this to work by
having on-cluster clients resolve the hostname to one IP (e.g. using hosts files or a local
DNS server) and off-cluster clients resolve it to another (e.g. they use what's in DNS). This
is actually the approach I posted for v1, but it has some drawbacks (e.g. lots of extra DNS
lookups) and more complex configuration, so I don't think we want to do this for trunk. It's
much simpler to be able to report multiple IPs, and configure which to use.
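
As a rough illustration of "report multiple IPs, and configure which to use", the sketch
below picks one of a DN's reported addresses based on where the client sits. The selection
rule (match the client against a configured cluster-private subnet) is an assumption made
for this example; the comment does not say how the choice would be configured. The function
name and all addresses are hypothetical.

```python
import ipaddress
from typing import List, Optional


def pick_dn_address(dn_ips: List[str], client_ip: str,
                    private_net: str) -> Optional[str]:
    """Return the DN address a given client should be handed.

    Assumed rule: clients inside private_net get the DN's private address,
    all other clients get an address outside that subnet.
    """
    net = ipaddress.ip_network(private_net)
    client_is_private = ipaddress.ip_address(client_ip) in net
    for ip in dn_ips:
        if (ipaddress.ip_address(ip) in net) == client_is_private:
            return ip
    return None  # the DN reported no address this client can use


# One DN with a cluster-private interface and an externally reachable one.
dn_ips = ["10.0.0.5", "192.0.2.5"]
print(pick_dn_address(dn_ips, "10.0.0.9", "10.0.0.0/24"))      # 10.0.0.5
print(pick_dn_address(dn_ips, "198.51.100.7", "10.0.0.0/24"))  # 192.0.2.5
```

Unlike the hostname approach, this needs no per-client DNS setup and no extra lookups; the
cost is that whoever hands out addresses has to know the subnet layout.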
                
> Support multiple network interfaces
> -----------------------------------
>
>                 Key: HADOOP-8198
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8198
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: io, performance
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>         Attachments: MultipleNifsv1.pdf, MultipleNifsv2.pdf
>
>
> Hadoop does not currently utilize multiple network interfaces, which is a common user
> request, and important in enterprise environments. This jira covers a proposal for enhancements
> to Hadoop so it better utilizes multiple network interfaces. The primary motivation being
> improved performance, performance isolation, resource utilization and fault tolerance. The
> attached design doc covers the high-level use cases, requirements, a proposal for trunk/0.23,
> discussion on related features, and a proposal for Hadoop 1.x that covers a subset of the
> functionality of the trunk/0.23 proposal.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
