hadoop-yarn-issues mailing list archives

From "Carlo Curino (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-624) Support gang scheduling in the AM RM protocol
Date Sat, 10 Aug 2013 16:54:50 GMT

    [ https://issues.apache.org/jira/browse/YARN-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735967#comment-13735967 ]

Carlo Curino commented on YARN-624:

Related to this is work we just proposed in YARN-1051. We manage dynamically negotiated reservations
of capacity at admission control. The idea is that if I want gang scheduling I can declare
this at submission time, and the system accepts me only if it can "fit" me. At that level we
do constraint checking / knapsack (e.g., ensuring that we never promise more gang-style reservations
than we can fit).
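
To make the admission-time check concrete, here is a minimal sketch in Python. It is not the YARN-1051 implementation: the names GangReservation, AdmissionController and try_admit are hypothetical, time is modeled as discrete steps, and capacity is counted in identical containers. It only does a first-come feasibility check (never promise more than fits at any time step), not a knapsack-style optimization.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GangReservation:
    """Hypothetical request: `containers` are all needed at once over [start, end)."""
    name: str
    containers: int
    start: int  # first time step, inclusive
    end: int    # last time step, exclusive


class AdmissionController:
    """Hypothetical admission control over a fixed planning horizon."""

    def __init__(self, cluster_capacity: int, horizon: int):
        self.capacity = cluster_capacity
        # committed[t] = containers already promised at time step t
        self.committed = [0] * horizon
        self.accepted = []

    def try_admit(self, r: GangReservation) -> bool:
        # Reject malformed or out-of-horizon requests.
        if r.start < 0 or r.end > len(self.committed) or r.start >= r.end:
            return False
        # Accept only if the whole gang fits at every step of its interval.
        for t in range(r.start, r.end):
            if self.committed[t] + r.containers > self.capacity:
                return False
        # Commit the capacity so later requests are checked against it.
        for t in range(r.start, r.end):
            self.committed[t] += r.containers
        self.accepted.append(r)
        return True


ctrl = AdmissionController(cluster_capacity=10, horizon=8)
print(ctrl.try_admit(GangReservation("a", 6, 0, 4)))  # True: cluster is empty
print(ctrl.try_admit(GangReservation("b", 6, 2, 6)))  # False: 6 + 6 > 10 at steps 2 and 3
print(ctrl.try_admit(GangReservation("c", 4, 2, 6)))  # True: 6 + 4 fits exactly
```

Because a rejected request commits nothing, everything that was accepted is guaranteed to fit together, which is what makes hoarding at run time safe in this model.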

This means that at run time AM hoarding is OK, because we guarantee that the reservation fits.
I am aware of at least 2 limitations of this approach w.r.t. the dynamic version you were proposing:
* it doesn't work if the application doesn't know about its needs until the AM has started
* it breaks down if we lose large chunks of the cluster (our previously checked constraints then no longer hold)

Neither seems a great concern, and the second one can be handled with re-planning in the admission control
(which we don't have yet, but it's on our agenda).

> Support gang scheduling in the AM RM protocol
> ---------------------------------------------
>                 Key: YARN-624
>                 URL: https://issues.apache.org/jira/browse/YARN-624
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, scheduler
>    Affects Versions: 2.0.4-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
> Per discussion on YARN-392 and elsewhere, gang scheduling, in which a scheduler runs
a set of tasks when they can all be run at the same time, would be a useful feature for YARN
schedulers to support.
> Currently, AMs can approximate this by holding on to containers until they get all the
ones they need.  However, this lends itself to deadlocks when different AMs are waiting on
the same containers.
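
The hoarding deadlock described in the issue can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: the function name simulate_hoarding is hypothetical, containers are identical, the scheduler hands out free containers round-robin, and an AM never releases a container until its full gang is satisfied.

```python
def simulate_hoarding(capacity, needs):
    """Hand out containers round-robin to AMs that hoard until fully satisfied.

    Returns (held, deadlocked), where held[i] is the number of containers
    AM i ends up holding, and deadlocked is True when the cluster is
    exhausted and no AM has its full gang.
    """
    held = [0] * len(needs)
    free = capacity
    while free > 0:
        progressed = False
        for i, need in enumerate(needs):
            if free > 0 and held[i] < need:
                held[i] += 1   # AM i grabs a container and keeps it
                free -= 1
                progressed = True
        if not progressed:
            break  # every AM has what it needs; spare capacity remains
    satisfied = [h >= n for h, n in zip(held, needs)]
    deadlocked = free == 0 and not any(satisfied)
    return held, deadlocked


# Two AMs each need a gang of 3, but only 4 containers exist.
print(simulate_hoarding(4, [3, 3]))  # ([2, 2], True): each holds 2, neither can start
# With 6 containers both gangs fit and nobody waits.
print(simulate_hoarding(6, [3, 3]))  # ([3, 3], False)
```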

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
