hadoop-yarn-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-624) Support gang scheduling in the AM RM protocol
Date Wed, 07 Aug 2013 17:28:52 GMT

    [ https://issues.apache.org/jira/browse/YARN-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13732188#comment-13732188 ]

Steve Loughran commented on YARN-624:

[~bikassaha] has pointed out that gang scheduling can be implemented in an AM today: it can
just hang on to assigned nodes until a minimum has been allocated, at which point it can bring
up its service.
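
As a rough illustration of the hold-until-minimum approach, here is a minimal Python sketch. It does not use the YARN client API; `GangCollector`, `on_allocated` and `min_containers` are hypothetical names invented for this example, and containers are represented as plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class GangCollector:
    """Hypothetical AM-side helper: hold containers until a minimum is reached."""
    min_containers: int
    held: list = field(default_factory=list)
    launched: bool = False

    def on_allocated(self, containers):
        """Record newly assigned containers; return the full gang once it is complete."""
        if self.launched:
            return []  # service already up; this sketch ignores later allocations
        self.held.extend(containers)
        if len(self.held) >= self.min_containers:
            self.launched = True
            return list(self.held)
        return []  # still below the minimum: keep holding, launch nothing


collector = GangCollector(min_containers=3)
first = collector.on_allocated(["c1", "c2"])   # below the minimum: keep holding
second = collector.on_allocated(["c3"])        # minimum reached: bring up the service
```

In this sketch the AM launches nothing until `on_allocated` returns a non-empty list, which is the point at which it would bring up its service on the held containers.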

This would move all scheduling decisions into the AM, which is free to implement its own policy.

If we start this way, then once a few apps have done this we can look at commonality and decide
what features, if any, need to go into YARN.

The most likely candidate (again, credit to Bikas) is for YARN to recognise that an AM has been
given some containers but has not, after a time period, deployed anything to them. YARN could
then cancel the lease unless the AM specifically indicates it wants to retain it.
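
One way to picture the idle-lease idea is the sketch below. It is only an assumption about how such a check might look, not anything YARN implements: `Lease`, `expired_leases`, `retain` and `idle_timeout` are hypothetical names, and times are plain seconds.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    """Hypothetical RM-side record for one container granted to an AM."""
    container_id: str
    granted_at: float
    in_use: bool = False   # the AM has deployed something to the container
    retain: bool = False   # the AM has explicitly asked to keep the idle container


def expired_leases(leases, now, idle_timeout):
    """Return the ids of idle, unretained leases older than idle_timeout seconds."""
    return [
        lease.container_id
        for lease in leases
        if not lease.in_use
        and not lease.retain
        and now - lease.granted_at > idle_timeout
    ]


leases = [
    Lease("c1", granted_at=0.0),               # idle, not retained
    Lease("c2", granted_at=0.0, in_use=True),  # running work
    Lease("c3", granted_at=0.0, retain=True),  # idle, but the AM wants to keep it
    Lease("c4", granted_at=50.0),              # idle, but still inside the timeout
]
to_cancel = expired_leases(leases, now=60.0, idle_timeout=30.0)
```

Only `c1` would be cancelled here: it is idle, past the timeout, and the AM has not asked to retain it. An AM doing its own gang scheduling would set the retain flag on the containers it is deliberately holding.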
> Support gang scheduling in the AM RM protocol
> ---------------------------------------------
>                 Key: YARN-624
>                 URL: https://issues.apache.org/jira/browse/YARN-624
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, scheduler
>    Affects Versions: 2.0.4-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
> Per discussion on YARN-392 and elsewhere, gang scheduling, in which a scheduler runs
> a set of tasks when they can all be run at the same time, would be a useful feature for YARN
> schedulers to support.
> Currently, AMs can approximate this by holding on to containers until they get all the
> ones they need. However, this lends itself to deadlocks when different AMs are waiting on
> the same containers.
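
The deadlock described in the issue can be reproduced with a toy model. This is a sketch under stated assumptions, not the behaviour of any real YARN scheduler: `hand_out` is a hypothetical round-robin allocator, and the cluster size and demands are made up.

```python
def hand_out(free, demands):
    """Give out containers one at a time, round-robin, to AMs that still need more."""
    held = {am: 0 for am in demands}
    while free > 0 and any(held[am] < need for am, need in demands.items()):
        for am, need in demands.items():
            if free > 0 and held[am] < need:
                held[am] += 1
                free -= 1
    return held


demands = {"am1": 3, "am2": 3}   # each AM needs a gang of 3 before it can start
held = hand_out(free=4, demands=demands)

# Neither AM has its full gang, and neither releases what it holds.
stuck = all(held[am] < need for am, need in demands.items())
```

With 4 free containers and two AMs that each insist on 3, each ends up holding 2 and waiting indefinitely for containers the other will never give back.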
