hadoop-common-issues mailing list archives

From "Eli Collins (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-8198) Support multiple network interfaces
Date Wed, 11 Apr 2012 16:29:18 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-8198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251709#comment-13251709 ]

Eli Collins commented on HADOOP-8198:
-------------------------------------

@Nathan, thanks for chiming in. Answers follow.

- Wiring up multiple interfaces does mean you need 2x the port count, more cable management
issues, and potentially additional switch configuration. That's true today for people who
use host-level bonding.
- For use case #1, supporting multiple interfaces is like supporting multiples of any host
resource (e.g. disks): you get improved performance and the ability to tolerate more failures,
at the cost of additional code complexity. We already have to tolerate client <-> worker
connection failures. We can leave the current behavior as is, or try to tolerate them better,
e.g. by working around them (see HDFS-3149). As with tolerating disk failures, this means some
hosts may have more resources than others (if only one interface is reported by default, this
only affects the multi-interface case). I'm also considering the impact on MR: you'd want the
shuffle to be able to take advantage of this as well and, more importantly, if it didn't you
could end up with more imbalanced network traffic.
- For use case #2, supporting multiple interfaces is simpler because clients don't necessarily
get multiple interfaces; different clients just end up getting different interfaces. This is
the same way the NN can bind to the wildcard today, which makes it available on multiple
interfaces so that clients can access it via any of them. Note that the two use cases are
independent: you can support #2 w/o #1 and vice versa.
- Wrt host-level bonding and 10GigE, see my comment above to Sanjay. Both help use case #1,
but they don't address use case #2, the primary motivation.
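
To make the wildcard point in use case #2 concrete, here is a minimal sketch using only the
standard java.net API. It is not Hadoop code: the class and method names are hypothetical, and
it only illustrates that a listener bound to the wildcard address accepts connections on every
local interface, so different clients can be handed different addresses for the same service.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NetworkInterface;
import java.net.ServerSocket;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical illustration only; none of these names come from Hadoop.
public class WildcardBindSketch {

    /** Collects the addresses of every local interface that is currently up. */
    public static List<InetAddress> localAddresses() throws SocketException {
        List<InetAddress> result = new ArrayList<>();
        Enumeration<NetworkInterface> nifs = NetworkInterface.getNetworkInterfaces();
        if (nifs == null) {
            return result; // some JDKs return null when no interface is found
        }
        for (NetworkInterface nif : Collections.list(nifs)) {
            if (nif.isUp()) {
                result.addAll(Collections.list(nif.getInetAddresses()));
            }
        }
        return result;
    }

    /** Binds a listener to the wildcard address on an ephemeral port. */
    public static ServerSocket bindWildcard() throws IOException {
        ServerSocket server = new ServerSocket();
        // InetSocketAddress(int) uses the wildcard address; port 0 picks a free port.
        server.bind(new InetSocketAddress(0));
        return server;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = bindWildcard()) {
            System.out.println("Bound to " + server.getLocalSocketAddress());
            // Each address below is a candidate endpoint for the same listener.
            for (InetAddress addr : localAddresses()) {
                System.out.println("Candidate: " + addr.getHostAddress()
                        + ":" + server.getLocalPort());
            }
        }
    }
}
```

Whether a given candidate address is actually reachable by a client still depends on routing
and firewall rules, which this sketch does not check.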
                
> Support multiple network interfaces
> -----------------------------------
>
>                 Key: HADOOP-8198
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8198
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: io, performance
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>         Attachments: MultipleNifsv1.pdf, MultipleNifsv2.pdf, MultipleNifsv3.pdf
>
>
> Hadoop does not currently utilize multiple network interfaces, which is a common user
> request, and important in enterprise environments. This jira covers a proposal for enhancements
> to Hadoop so it better utilizes multiple network interfaces. The primary motivation being
> improved performance, performance isolation, resource utilization and fault tolerance. The
> attached design doc covers the high-level use cases, requirements, a proposal for trunk/0.23,
> discussion on related features, and a proposal for Hadoop 1.x that covers a subset of the
> functionality of the trunk/0.23 proposal.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
