hadoop-yarn-issues mailing list archives

From "Carlo Curino (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5864) Capacity Scheduler preemption for fragmented cluster
Date Fri, 16 Dec 2016 10:52:59 GMT

    [ https://issues.apache.org/jira/browse/YARN-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15754101#comment-15754101 ]

Carlo Curino commented on YARN-5864:
------------------------------------

[~wangda] I like the direction of specifying more clearly what happens. I think working on
a design doc that spells this out would be very valuable, and I am happy to review and brainstorm
with you if you think that would be useful. (But FYI: I am on parental leave and traveling abroad
until mid-Jan.)

In writing the document, I think you should in particular address the semantics from all points
of view, e.g., which guarantees do I get as a user of any of the queues (not just the one
we are preempting in favor of)? It is clear that if I am running over-capacity I can be preempted,
but what happens if I am (safely?) within my capacity? (This is related to the "abuses" I
was describing before, e.g., one in which I ask for massive containers on the nodes I want,
and then resize them down after you have killed anyone in my way.)

Looking further ahead: ideally, the document you are starting in order to capture the semantics of
this feature can be expanded gradually to cover all "tunables" of the scheduler, and to explore
the many complex interactions among features and the semantics we can derive from them (I
bet we might be able to get rid of some redundancies). This could become part of the documentation
of YARN. Even nicer would be to codify this with SLS-driven tests (so that no future feature
can break the semantics you are capturing without us noticing).

> Capacity Scheduler preemption for fragmented cluster 
> -----------------------------------------------------
>
>                 Key: YARN-5864
>                 URL: https://issues.apache.org/jira/browse/YARN-5864
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-5864.poc-0.patch
>
>
> YARN-4390 added preemption for reserved containers. However, we found one case in which a large
> container cannot be allocated even if all queues are under their limits.
> For example, we have:
> {code}
> Two queues, a and b, capacity 50:50 
> Two nodes: n1 and n2, each with 50 resource 
> Now queue-a uses 10 on n1 and 10 on n2
> queue-b asks for one single container with resource=45. 
> {code} 
> The container could be reserved on either host, but no preemption will happen because
> all queues are under their limits.
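
The arithmetic of the example in the issue description can be checked with a small sketch.
This is not Capacity Scheduler code: the Node class, the diagnose function and the
"preempt only from over-capacity queues" rule as written here are hypothetical simplifications,
assumed only to illustrate why the request is stuck even though the cluster has enough free
resource in total.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical model of a node: total resource and usage per queue."""
    name: str
    total: int
    used_by_queue: dict  # queue name -> resource used on this node

    @property
    def free(self) -> int:
        return self.total - sum(self.used_by_queue.values())


def queue_usage(nodes, queue):
    """Resource a queue uses across all nodes."""
    return sum(n.used_by_queue.get(queue, 0) for n in nodes)


def diagnose(nodes, capacity_percent, asking_queue, request):
    """Summarize the situation for one container request (simplified model)."""
    cluster_total = sum(n.total for n in nodes)
    # Guaranteed resource per queue, derived from its capacity share.
    guaranteed = {q: cluster_total * pct // 100 for q, pct in capacity_percent.items()}
    # Assumed rule: only queues running over their guarantee can be preempted.
    over_capacity = [q for q in guaranteed if queue_usage(nodes, q) > guaranteed[q]]
    return {
        "within_asking_queue_capacity":
            queue_usage(nodes, asking_queue) + request <= guaranteed[asking_queue],
        "cluster_free": sum(n.free for n in nodes),
        "fits_on_some_node": any(n.free >= request for n in nodes),
        "preemptable_queues": over_capacity,
    }


# The scenario from the description: queues a and b at 50:50, two nodes of 50,
# queue-a uses 10 on each node, queue-b asks for a single container of 45.
nodes = [
    Node("n1", 50, {"a": 10}),
    Node("n2", 50, {"a": 10}),
]
result = diagnose(nodes, {"a": 50, "b": 50}, asking_queue="b", request=45)
print(result)
```

In this model the request is within queue-b's capacity and the cluster has 80 free in total,
but each node has only 40 free, so the container fits on neither node. Since no queue is over
its capacity, the simplified rule finds nothing to preempt.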



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org

