hadoop-yarn-issues mailing list archives

From "Nathan Roberts (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4963) capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable
Date Wed, 20 Apr 2016 15:57:25 GMT

    [ https://issues.apache.org/jira/browse/YARN-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250137#comment-15250137 ]

Nathan Roberts commented on YARN-4963:

bq. IMO, application-specific configurations should be available rather than only a scheduler-level
setting. Applications that are fine with being assigned containers off_switch can specify the
number of containers to be assigned, while the few applications that are very strict about
node locality can configure 1 for off_switch.

bq. I feel the same. Is there any specific reason it has been set only at the scheduler level, other
than the AMRM interface change? We can keep the default value as 1 so that it's still compatible.
Also, allocation happens within the app's and queue's capacity limits anyway, so I feel it would
be ideal for the app to decide how many allocations to allow on an off_switch node. Thoughts?

Thanks [~Naganarasimha], [~rohithsharma], [~leftnoteasy] for the comments. I think we're all
in agreement that there needs to be some control at the application level for things like
OFF_SWITCH allocations, and locality delays (That's what #2 was going for and I think that
should be a separate jira if folks are agreeable to that.) This new feature will require some
discussion:
- The current value of 1 is a poor fit for almost all applications, so I think when we
do the application-level support the default would need to be either unlimited or some high
value; otherwise we force all applications to set this limit to something other than 1 to
get decent OFF_SWITCH scheduling behavior.
- This setting not only affects the application at hand, but can also affect the entire system.
I can see many cases where applications will relax these settings significantly so that their
application schedules faster, however that may not have been the right thing for the system
as a whole. Sure, my application scheduled very quickly but my locality was terrible so I
caused a lot of unnecessary cross-switch traffic. So I think we'll need some system-minimums
that will prevent this type of abuse. 
- These changes would potentially affect the fifo-ness of the queues. If application A meets
its OFF-SWITCH-per-node limit, do we offer the node to other applications in the same queue?
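
As an illustration only (not part of the patch), the interaction between per-application settings and system-level bounds described in the points above might look like the following sketch. The class, method, and constant names are hypothetical, and the reading of "system-minimums" as a cap on OFF_SWITCH assignments plus a floor on locality delay is an assumption.

```java
import java.util.OptionalInt;

public class AppLocalityLimitsSketch {

    // Hypothetical cluster-wide bounds an admin might set; names and values are illustrative only.
    static final int SYSTEM_MAX_OFF_SWITCH_PER_HEARTBEAT = 10;
    static final int SYSTEM_MIN_LOCALITY_DELAY = 5;

    /**
     * An application that sets nothing gets the system cap (a high default rather than 1);
     * an application that asks for more than the cap is clamped down to it.
     */
    static int effectiveOffSwitchLimit(OptionalInt appRequested, int systemMax) {
        return Math.min(appRequested.orElse(systemMax), systemMax);
    }

    /**
     * An application may wait longer for locality than the system requires,
     * but it cannot relax the delay below the system floor.
     */
    static int effectiveLocalityDelay(OptionalInt appRequested, int systemMin) {
        return Math.max(appRequested.orElse(systemMin), systemMin);
    }

    public static void main(String[] args) {
        // Application asks for 50 OFF_SWITCH assignments; the system cap wins.
        System.out.println(effectiveOffSwitchLimit(OptionalInt.of(50), SYSTEM_MAX_OFF_SWITCH_PER_HEARTBEAT)); // 10
        // Strict application keeps its own limit of 1.
        System.out.println(effectiveOffSwitchLimit(OptionalInt.of(1), SYSTEM_MAX_OFF_SWITCH_PER_HEARTBEAT));  // 1
        // Application tries to disable the locality delay; the system floor wins.
        System.out.println(effectiveLocalityDelay(OptionalInt.of(0), SYSTEM_MIN_LOCALITY_DELAY));             // 5
    }
}
```

The sketch leaves the fifo-ness question open: it only computes effective limits and says nothing about whether a node is offered to other applications once one application hits its limit.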

So my suggestion is:
1) Have this jira make the system-level OFF-SWITCH check configurable so admins can easily
crank this up and dramatically improve scheduling rate.
2) Have a second jira to address per-application settings for things like locality_delay and
off_switch limits.
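
For readers unfamiliar with the limit in suggestion 1, here is a minimal sketch of what a configurable per-heartbeat OFF_SWITCH limit does. It is a simplified model, not the capacity scheduler's actual code: the class, the `assignOnHeartbeat` method, and the choice to skip (rather than stop at) OFF_SWITCH requests once the limit is reached are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OffSwitchHeartbeatSketch {

    enum Locality { NODE_LOCAL, RACK_LOCAL, OFF_SWITCH }

    /**
     * Models one node heartbeat: walks the pending requests in order and assigns them
     * until the node runs out of free containers. OFF_SWITCH requests beyond
     * maxOffSwitchPerHeartbeat are skipped and would wait for a later heartbeat.
     */
    static List<Locality> assignOnHeartbeat(List<Locality> pending,
                                            int freeContainers,
                                            int maxOffSwitchPerHeartbeat) {
        List<Locality> assigned = new ArrayList<>();
        int offSwitchCount = 0;
        for (Locality locality : pending) {
            if (assigned.size() >= freeContainers) {
                break; // node is full for this heartbeat
            }
            if (locality == Locality.OFF_SWITCH) {
                if (offSwitchCount >= maxOffSwitchPerHeartbeat) {
                    continue; // limit reached; leave this request pending
                }
                offSwitchCount++;
            }
            assigned.add(locality);
        }
        return assigned;
    }

    public static void main(String[] args) {
        List<Locality> pending = Arrays.asList(
                Locality.OFF_SWITCH, Locality.OFF_SWITCH, Locality.NODE_LOCAL, Locality.OFF_SWITCH);

        // With the current fixed value of 1, only one OFF_SWITCH request is placed.
        System.out.println(assignOnHeartbeat(pending, 10, 1)); // [OFF_SWITCH, NODE_LOCAL]

        // With the limit raised to 3, all four requests are placed in a single heartbeat.
        System.out.println(assignOnHeartbeat(pending, 10, 3));
    }
}
```

In this model, raising the limit from 1 to 3 lets a workload with little locality fill the node in one heartbeat instead of three, which is the scheduling-rate improvement the suggestion is after.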


> capacity scheduler: Make number of OFF_SWITCH assignments per heartbeat configurable
> ------------------------------------------------------------------------------------
>                 Key: YARN-4963
>                 URL: https://issues.apache.org/jira/browse/YARN-4963
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler
>    Affects Versions: 3.0.0, 2.7.2
>            Reporter: Nathan Roberts
>            Assignee: Nathan Roberts
>         Attachments: YARN-4963.001.patch
> Currently the capacity scheduler will allow exactly 1 OFF_SWITCH assignment per heartbeat.
> With more and more non-MapReduce workloads coming along, the degree of locality is declining,
> causing scheduling to be significantly slower. It's still important to limit the number of
> OFF_SWITCH assignments to avoid densely packing OFF_SWITCH containers onto nodes.
> Proposal is to add a simple config that makes the number of OFF_SWITCH assignments configurable.
> Will upload candidate patch shortly.

This message was sent by Atlassian JIRA
