hadoop-yarn-issues mailing list archives

From "Carlo Curino (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-45) Scheduler feedback to AM to release containers
Date Wed, 10 Apr 2013 21:47:16 GMT

    [ https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628305#comment-13628305 ]

Carlo Curino commented on YARN-45:

Alejandro, thanks for the feedback, and yes, you are spot on. I think what you propose is akin
to the Set<ResourceRequest> we have (which, if I understand correctly, is similar to the
PreemptResource thing you describe). We plan to support this, and it covers one set of
use cases very well, i.e., when we have a "broad" request and we are OK with the AM resolving
it as it sees fit. As you point out, this is good because it allows the AM to be smart about
what to return, and thus more likely to avoid expensive preemptions in favor of cheap ones,
or even to return a container that is not data-local in place of one that is data-local, etc.

However, this feels contrived when we know precisely what we want back from a certain AM (e.g.,
we want to preempt a specific container). For this purpose the Set<ContainerID>-based
preemption is easier to use, and it also simplifies the bookkeeping done in the RM (in our preemption
policy) to decide when to "kill" a container if the AM does not preempt it within a certain
timeout. This is a good match with the FairScheduler internals, and we adapted CapacityScheduler
to leverage this too by means of a preemption monitor. This will be clearer when we release
the actual monitor (in the next few days), but the idea is that if we talk to the AM in terms
of a Set<ContainerID> there is no ambiguity in detecting when the AM is ignoring us, and
thus when we have to move on to container killing (e.g., to enforce capacity/fairness). By
contrast, using ResourceRequest or something like it, we might not know whether the resource
I want back now is the same one I wanted in some previous iteration (hence I am being ignored
by the AM) or whether the two just happen to be the same/similar.
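
To illustrate why container IDs make this bookkeeping unambiguous, here is a minimal sketch
in Java. It is not the monitor mentioned above: the class name PreemptionTracker, the use of
plain strings for container IDs, and the single fixed timeout are all assumptions made for
illustration. On each iteration the policy passes in the containers it currently wants back
and the containers still running; any container that has been wanted continuously for at
least the timeout and is still running is returned as a kill candidate.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical RM-side bookkeeping for ContainerID-based preemption requests. */
final class PreemptionTracker {
    private final long timeoutMs;
    // Container ID -> time at which we first asked the AM to release it.
    private final Map<String, Long> firstAsked = new HashMap<>();

    PreemptionTracker(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    /**
     * Records the containers wanted at this iteration and returns those the AM
     * has not released within the timeout (i.e., candidates for killing).
     */
    Set<String> update(Set<String> wanted, Set<String> running, long nowMs) {
        // Forget containers we no longer want, or that have already gone away.
        firstAsked.keySet().retainAll(wanted);
        firstAsked.keySet().retainAll(running);

        Set<String> toKill = new TreeSet<>();
        for (String id : wanted) {
            if (!running.contains(id)) {
                continue; // already released, nothing to track
            }
            // putIfAbsent returns null the first time we ask for this container.
            Long since = firstAsked.putIfAbsent(id, nowMs);
            if (since != null && nowMs - since >= timeoutMs) {
                toKill.add(id);
            }
        }
        // Kill candidates are handed off, so stop tracking them.
        firstAsked.keySet().removeAll(toKill);
        return toKill;
    }
}
```

Because the key is the container ID itself, "asked before and still running" is a direct map
lookup. With a resource-based ask there is no such stable key, which is the ambiguity
described above.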

If we can devise a simple way to leverage a single resource-based representation for both
scenarios I would be happy to drop the Set<ContainerID>, but so far we haven't found
a clean way to do it, so we provisioned for both Set<ResourceRequest> and/or Set<ContainerID>
to be optionally part of a PreemptRequest. The current semantics are that these are disjoint
sets of resources we want (some called out as containers, and some expressed as resources),
but we don't have a strong reason for this not to be a tagged union.
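
As a rough sketch of the shape described above (not the code in the attached patch), a
request carrying both optional sets could look like the following in Java. The names
PreemptRequestSketch and ResourceAsk, the string container IDs, and the memory-only resource
model are assumptions made for illustration. The two sets are treated as disjoint: the named
containers are wanted back in addition to whatever the resource asks describe.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical resource-based ask: `count` containers of `memoryMb` each. */
final class ResourceAsk {
    final int memoryMb;
    final int count;

    ResourceAsk(int memoryMb, int count) {
        this.memoryMb = memoryMb;
        this.count = count;
    }
}

/** Hypothetical preemption request holding either or both kinds of ask. */
final class PreemptRequestSketch {
    private final Set<String> containerIds;
    private final Set<ResourceAsk> resourceAsks;

    /** Either argument may be null, meaning that kind of ask is absent. */
    PreemptRequestSketch(Set<String> containerIds, Set<ResourceAsk> resourceAsks) {
        this.containerIds = copyOrEmpty(containerIds);
        this.resourceAsks = copyOrEmpty(resourceAsks);
    }

    private static <T> Set<T> copyOrEmpty(Set<T> in) {
        return in == null
            ? Collections.<T>emptySet()
            : Collections.unmodifiableSet(new LinkedHashSet<T>(in));
    }

    boolean hasContainers() { return !containerIds.isEmpty(); }

    boolean hasResourceAsks() { return !resourceAsks.isEmpty(); }

    boolean isEmpty() { return containerIds.isEmpty() && resourceAsks.isEmpty(); }

    Set<String> getContainerIds() { return containerIds; }

    Set<ResourceAsk> getResourceAsks() { return resourceAsks; }

    /** Memory covered by the resource-based asks only, excluding named containers. */
    int resourceAskMemoryMb() {
        int total = 0;
        for (ResourceAsk ask : resourceAsks) {
            total += ask.memoryMb * ask.count;
        }
        return total;
    }
}
```

A tagged-union variant would instead allow exactly one of the two sets to be present per
request; the sketch above deliberately permits both, matching the disjoint-sets reading.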

Do you think the above covers the use case you have in mind, or am I missing something? (BTW,
I am very curious to hear what your use case is.)

> Scheduler feedback to AM to release containers
> ----------------------------------------------
>                 Key: YARN-45
>                 URL: https://issues.apache.org/jira/browse/YARN-45
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Chris Douglas
>            Assignee: Carlo Curino
>         Attachments: YARN-45.patch
> The ResourceManager strikes a balance between cluster utilization and strict enforcement
of resource invariants in the cluster. Individual allocations of containers must be reclaimed,
or reserved, to restore the global invariants when cluster load shifts. In some cases, the
ApplicationMaster can respond to fluctuations in resource availability without losing the
work already completed by that task (MAPREDUCE-4584). Supplying it with this information would
be helpful for overall cluster utilization [1]. To this end, we want to establish a protocol
for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
