hadoop-yarn-issues mailing list archives

From "Sidharta Seethana (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2140) Add support for network IO isolation/scheduling for containers
Date Tue, 10 Mar 2015 18:18:47 GMT

    [ https://issues.apache.org/jira/browse/YARN-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355377#comment-14355377 ]

Sidharta Seethana commented on YARN-2140:

You are right: there are several areas to think about here, and we definitely need to put
more thought into scheduling. To schedule network resources effectively, we would need to
understand a) the overall network topology in place for the cluster in question, that is,
the characteristics of the ‘route’ between any two nodes in the cluster: the number of hops
required and the available/max bandwidth at each point in the route; and b) application
characteristics w.r.t. network utilization: internal/external traffic, latency vs. bandwidth
sensitivities, etc.

With regard to inbound traffic, we currently do not have a good way to manage it effectively.
By the time inbound packets are being ‘examined’ on a given node, they have already consumed
bandwidth along the way, and the only options we have are to drop them immediately (we cannot
queue on the inbound side) or to let them through. The design document mentions these
limitations. One possible approach here could be to let the application provide ‘hints’ for
inbound network utilization (not all applications might be able to do this) and use this
information purely for scheduling purposes. This, of course, adds more complexity to
scheduling.
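
To make the topology and ‘hints’ ideas above a little more concrete, here is a minimal
Python sketch. It is purely illustrative and not part of any YARN API: the Link class, the
function names and the Mbps units are all assumptions. It models a route as a list of links,
each with a max and an available bandwidth, and checks whether an application's hinted
inbound rate fits under the tightest link on the route.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Link:
    """One hop on a route between two nodes (hypothetical model)."""
    max_mbps: float        # capacity of this hop
    available_mbps: float  # bandwidth currently unused on this hop


def hop_count(route: List[Link]) -> int:
    """Number of hops on the route."""
    return len(route)


def bottleneck_mbps(route: List[Link]) -> float:
    """Available bandwidth of the route, limited by its tightest hop."""
    if not route:
        # Empty route: source and destination are the same node.
        return float("inf")
    return min(link.available_mbps for link in route)


def fits_inbound_hint(route: List[Link], hinted_inbound_mbps: float) -> bool:
    """True if the application's hinted inbound rate fits on the route."""
    return hinted_inbound_mbps <= bottleneck_mbps(route)


# Example: node -> top-of-rack switch -> core switch -> node
route = [Link(1000, 400), Link(10000, 2500), Link(1000, 150)]
print(hop_count(route), bottleneck_mbps(route), fits_inbound_hint(route, 100))
```

A scheduler would use a check like this only to decide placement; as noted above, it does
nothing to enforce the inbound rate once traffic is flowing.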

Needless to say, there are hard problems to solve here, and the (network) scheduling requirements
(and potential approaches for implementation) will need further investigation. As a first step,
though, I think it makes sense to focus on classification of outbound traffic (net_cls) and
maybe basic isolation/enforcement plus collection of metrics. Once we have this in place, we
could look at real utilization patterns and decide what the next steps should be.
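
For reference, the net_cls cgroup controller tags outbound packets with a class id that tc
can then match on. The value written to a cgroup's net_cls.classid file has the form
0xAAAABBBB, where AAAA is the major and BBBB the minor number of the tc handle. The Python
sketch below only shows that encoding; the function names are made up for illustration, and
how YARN would assign class ids per container is not settled in this comment.

```python
def net_cls_classid(major: int, minor: int) -> int:
    """Pack a tc handle (major:minor) into the 0xAAAABBBB net_cls.classid value."""
    if not (0 <= major <= 0xFFFF and 0 <= minor <= 0xFFFF):
        raise ValueError("major and minor must each fit in 16 bits")
    return (major << 16) | minor


def tc_handle(classid: int) -> str:
    """Render a net_cls.classid value as the hex major:minor handle tc uses."""
    return f"{classid >> 16:x}:{classid & 0xFFFF:x}"


# Handle 10:1 (hex) corresponds to the value 0x100001.
classid = net_cls_classid(0x10, 0x1)
print(hex(classid), tc_handle(classid))
```

Writing that value into the net_cls.classid file of a container's cgroup would mark the
container's outbound packets, after which a tc filter on the node could shape or count them
per class.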

> Add support for network IO isolation/scheduling for containers
> --------------------------------------------------------------
>                 Key: YARN-2140
>                 URL: https://issues.apache.org/jira/browse/YARN-2140
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Wei Yan
>            Assignee: Wei Yan
>         Attachments: NetworkAsAResourceDesign.pdf

This message was sent by Atlassian JIRA
