hadoop-yarn-issues mailing list archives

From "Carlo Curino (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-624) Support gang scheduling in the AM RM protocol
Date Tue, 21 May 2013 17:01:21 GMT

    [ https://issues.apache.org/jira/browse/YARN-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663125#comment-13663125 ]

Carlo Curino commented on YARN-624:

I have two levels of comments: the first is to clarify the intent of my earlier messages, and
the second is to address Robert's description of a use case for ML frameworks.

[~vinodkv], I completely agree with you that we should be very deliberate in choosing which
use cases to support, and make sure we only add features that target concrete and, I would argue,
imminent use cases.
Reflecting on a conversation I had with Alejandro, I was trying to help this conversation
take this form:
1) push for a broad discussion of the gang-scheduling use cases we know of, so that we
understand the full complexity of the problem (hence the comments around more advanced
features such as OR of gangs)
2) let a set of core features emerge from the most concrete short-term needs we have (the
Storm example is a good place to start for this)
3) try to devise a protocol that supports the core features well, but that is amenable to
future expansion (inasmuch as we can guess our future needs based on 1)
So in terms of concrete actions I am totally aligned with your request for "groundedness",
but I think it would really benefit us to also spell out some of the future requirements
so that we have a chance to design for extensibility (similarly to what you guys pushed
for in YARN-45, which I thought was really a good call).
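
To make point 3) concrete, here is a minimal sketch of what an all-or-nothing gang ask could
look like as an extensible data structure. All names here are hypothetical illustrations, not
the actual YARN API or any proposed wire format:

```java
// Hypothetical sketch of a gang-scheduling ask: a group of container
// requests that must all be satisfiable simultaneously. Kept as a
// small standalone record so future extensions (e.g. OR of gangs,
// placement constraints) could be layered on without breaking it.
import java.util.List;

public class GangRequestSketch {

    // One homogeneous ask: `count` containers of `memoryMb` each.
    record ContainerAsk(int count, int memoryMb) {}

    // A gang: every ask in the list is granted together, or none is.
    record Gang(List<ContainerAsk> asks) {
        int totalContainers() {
            return asks.stream().mapToInt(ContainerAsk::count).sum();
        }
        int totalMemoryMb() {
            return asks.stream()
                       .mapToInt(a -> a.count() * a.memoryMb()).sum();
        }
    }

    public static void main(String[] args) {
        // e.g. a Storm-like topology asking for 10 x 2GB workers
        Gang gang = new Gang(List.of(new ContainerAsk(10, 2048)));
        System.out.println(gang.totalContainers()); // 10
        System.out.println(gang.totalMemoryMb());   // 20480
    }
}
```

The point of the sketch is only that the core ("grant all of these together") stays a small,
closed unit, so richer combinators can wrap it later rather than complicate it now.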

ML Use Cases:
I asked Markus Weimer (an ML/systems guy in our group) to summarize why he sees gang scheduling
as key for ML frameworks (which I think are going to flock to YARN in the coming months/years).

Here is his response:
"In many iterative algorithms, it is imperative to load all the data into the main memory
to minimize execution time. This is true for systems like Giraph, Mahout and many others that
will over time be on YARN. In order to satisfy their memory requirement, they will block holding
on to idle slots until YARN has delivered all the resources needed. Exposing that pattern
via gang scheduling seems beneficial.
Furthermore, these systems are often communications intensive. Hence, they’d benefit from
a gang of containers that are collocated on the network. This is a gang-wide property of the
resource ask that cannot be captured easily without gang scheduling. The alternatives (e.g.
getting a container on each rack, then expand from there to see which rack “wins”) are
quite wasteful in comparison.
Lastly, scheduling with alternatives at the gang level would be helpful. If e.g. the training
data for a machine learning algorithm needs 128GB of RAM, any combination of containers with
that amount of RAM would satisfy the need. However, preference is given to fewer machines
as that reduces the communication overhead."

While I appreciate that the level of urgency for what Markus describes is not comparable to
Storm's, I see ML as an important future use case for YARN. And gang scheduling seems like one
of those features that will determine whether people build on YARN or on something like Mesos.

> Support gang scheduling in the AM RM protocol
> ---------------------------------------------
>                 Key: YARN-624
>                 URL: https://issues.apache.org/jira/browse/YARN-624
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, scheduler
>    Affects Versions: 2.0.4-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
> Per discussion on YARN-392 and elsewhere, gang scheduling, in which a scheduler runs
> a set of tasks when they can all be run at the same time, would be a useful feature for YARN
> schedulers to support.
> Currently, AMs can approximate this by holding on to containers until they get all the
> ones they need.  However, this lends itself to deadlocks when different AMs are waiting on
> the same containers.
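
The deadlock the description mentions can be illustrated with a toy condition (illustrative
only, not YARN code): two AMs each hold partial gangs while waiting for the rest, and together
they exhaust the cluster so neither can ever finish.

```java
// Toy illustration of the hold-until-complete deadlock: two AMs on a
// 4-container cluster each need a gang of 3, each grabs 2, and each
// waits forever for a third container that can never arrive.
public class GangDeadlockToy {

    // True when the cluster is exhausted but neither AM has a full gang.
    static boolean deadlocked(int heldA, int heldB, int needEach,
                              int clusterSize) {
        int free = clusterSize - heldA - heldB;
        return free == 0 && heldA < needEach && heldB < needEach;
    }

    public static void main(String[] args) {
        System.out.println(deadlocked(2, 2, 3, 4)); // true: both stuck
        System.out.println(deadlocked(3, 1, 3, 4)); // false: A can run
    }
}
```

A gang-aware scheduler avoids this by only committing containers when the whole gang fits.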

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
