hadoop-yarn-issues mailing list archives

From "Arun C Murthy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers
Date Tue, 16 Apr 2013 05:09:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632571#comment-13632571 ]

Arun C Murthy commented on YARN-45:
-----------------------------------

Sorry, I've been away for a couple of weeks due to family reasons and I'm just catching up.

The bare-minimum requirements seem to be:
# RM should notify the AM that a certain amount of resources will need to be reclaimed (à la
SIGTERM).
# Thus, the AM gets an opportunity to *pick* which containers it will sacrifice to satisfy
the RM's requirements.
# Iff the AM doesn't act, the RM will go ahead and terminate some containers (probably the
most-recently allocated ones), à la SIGKILL.
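
To make the fallback in the third point concrete, here is a rough Java sketch of how the RM side might choose victims if the AM releases too little. Everything in it is an assumption for illustration: the Container class, the use of memory in MB as the only resource dimension, and the method names are hypothetical and are not part of any YARN API or of the attached patches.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PreemptionFallbackSketch {

    /** Hypothetical view of a running container (not a YARN type). */
    static final class Container {
        final String id;
        final int memoryMb;           // assumption: memory is the only resource
        final long allocatedAtMillis; // allocation timestamp

        Container(String id, int memoryMb, long allocatedAtMillis) {
            this.id = id;
            this.memoryMb = memoryMb;
            this.allocatedAtMillis = allocatedAtMillis;
        }
    }

    /** Amount still owed after the AM's voluntary releases; never negative. */
    static int outstanding(int requestedMb, int releasedByAmMb) {
        return Math.max(0, requestedMb - releasedByAmMb);
    }

    /**
     * SIGKILL-style fallback: walk the containers newest-first and collect
     * victims until the outstanding amount is covered.
     */
    static List<Container> selectVictims(List<Container> running, int outstandingMb) {
        List<Container> newestFirst = new ArrayList<>(running);
        newestFirst.sort(
            Comparator.comparingLong((Container c) -> c.allocatedAtMillis).reversed());

        List<Container> victims = new ArrayList<>();
        int coveredMb = 0;
        for (Container c : newestFirst) {
            if (coveredMb >= outstandingMb) {
                break; // enough reclaimed, stop killing
            }
            victims.add(c);
            coveredMb += c.memoryMb;
        }
        return victims;
    }
}
```

If the AM has already released at least the requested amount, outstanding(...) returns 0 and selectVictims(...) returns an empty list, so nothing is killed.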

Given the above, I feel that this is a set of changes we need to be conservative about - particularly
since the really simple preemption, i.e. SIGKILL alone on the RM side, is trivial (from an API
perspective).

Thus, I'm concerned about jumping into a complex preemption API (ResourceRequest etc.) in the
very first iteration, before we have sufficient experience with it.

I like [~tucu00]'s initial suggestion of: 
# Resource resourcesToReclaim
# Optionally, a Set<ContainerId> which the RM will preempt i.e. SIGKILL 

In fact, for the first iteration, Set<ContainerId> is something we can avoid if the
semantics are clear i.e. RM will preempt the most-recently allocated containers.
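
As a rough illustration of how small that first-iteration message could be, here is a Java sketch. It is only a sketch under stated assumptions: ReclaimRequest and its methods are hypothetical names, an int of memory in MB stands in for Resource, and String stands in for ContainerId so that the example is self-contained.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Optional;
import java.util.Set;

public class ReclaimMessageSketch {

    /** Hypothetical RM-to-AM message; not the actual YARN API. */
    static final class ReclaimRequest {
        private final int resourcesToReclaimMb;        // stand-in for Resource
        private final Set<String> containersToPreempt; // stand-in for Set<ContainerId>; null if omitted

        private ReclaimRequest(int resourcesToReclaimMb, Set<String> containersToPreempt) {
            if (resourcesToReclaimMb < 0) {
                throw new IllegalArgumentException("resourcesToReclaimMb must be >= 0");
            }
            this.resourcesToReclaimMb = resourcesToReclaimMb;
            this.containersToPreempt = containersToPreempt;
        }

        /** First-iteration form: only the amount; the RM's default policy picks victims. */
        static ReclaimRequest amountOnly(int resourcesToReclaimMb) {
            return new ReclaimRequest(resourcesToReclaimMb, null);
        }

        /** Optional form: the RM also names the containers it will SIGKILL. */
        static ReclaimRequest withContainers(int resourcesToReclaimMb, Set<String> ids) {
            Set<String> copy = Collections.unmodifiableSet(new LinkedHashSet<>(ids));
            return new ReclaimRequest(resourcesToReclaimMb, copy);
        }

        int resourcesToReclaimMb() {
            return resourcesToReclaimMb;
        }

        /** Empty when the RM did not name specific containers. */
        Optional<Set<String>> containersToPreempt() {
            return Optional.ofNullable(containersToPreempt);
        }
    }
}
```

Keeping the container set optional means the amount-only form can ship first, and the named-container form can be added later without changing what existing AMs see.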

Once we have sufficient experience with this, we can then dive deeper to think about further
enhancements to the API by adding features (in a compatible manner for 2.x or 3.x).

Thoughts? 
                
> Scheduler feedback to AM to release containers
> ----------------------------------------------
>
>                 Key: YARN-45
>                 URL: https://issues.apache.org/jira/browse/YARN-45
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Chris Douglas
>            Assignee: Carlo Curino
>         Attachments: YARN-45.patch, YARN-45.patch
>
>
> The ResourceManager strikes a balance between cluster utilization and strict enforcement
> of resource invariants in the cluster. Individual allocations of containers must be reclaimed -
> or reserved - to restore the global invariants when cluster load shifts. In some cases, the
> ApplicationMaster can respond to fluctuations in resource availability without losing the
> work already completed by that task (MAPREDUCE-4584). Supplying it with this information would
> be helpful for overall cluster utilization [1]. To this end, we want to establish a protocol
> for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
