hadoop-yarn-issues mailing list archives

From "Eric Payne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3275) Preemption happening on non-preemptable queues
Date Sat, 28 Feb 2015 16:27:05 GMT

    [ https://issues.apache.org/jira/browse/YARN-3275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341626#comment-14341626 ]

Eric Payne commented on YARN-3275:

Thanks very much, [~leftnoteasy], for reviewing this issue.
> Actually, going over max capacity is possible: when a cluster has resource = 1000G and a queue
> reaches its max capacity, then after the cluster resource goes down to 100G, the queue can be over max capacity.
> In addition, a parent queue can go beyond max capacity as described in YARN-3243, no matter whether
> the cluster resource changed or not. But a child queue can only go beyond max capacity when the cluster
> resource is reduced.
It is possible that the total available capacity of the cluster dropped by some percentage,
causing the leaf node to go over its abs max cap by 5%. The cluster has a large number of
nodes and memory, and that value is always changing slightly as nodes are lost and re-register.
This may not account for the 5% overage we saw on the small leaf queue, because that total
memory number isn't varying by 5%.
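
To put rough numbers on the two cases above, here is a small sketch. It is illustrative only: the 10% absolute max capacity is an assumed value, and the function names are made up for this example rather than taken from the scheduler.

```python
def overage(used_gb, cluster_gb, abs_max_cap):
    """Fraction by which usage exceeds the queue's absolute max capacity.

    0.0 means the queue sits exactly at its limit; 0.05 means 5% over.
    """
    limit_gb = cluster_gb * abs_max_cap
    return used_gb / limit_gb - 1.0


def shrink_needed(target_overage):
    """Fraction of the cluster that must be lost for a queue that was exactly
    at its limit to end up target_overage over it, with usage unchanged."""
    return 1.0 - 1.0 / (1.0 + target_overage)


ABS_MAX_CAP = 0.10            # assumed absolute max capacity for the leaf queue
used = 1000 * ABS_MAX_CAP     # queue filled to its limit on a 1000G cluster

print(overage(used, 1000, ABS_MAX_CAP))  # at the limit before the shrink
print(overage(used, 100, ABS_MAX_CAP))   # far over once the cluster is 100G
print(shrink_needed(0.05))               # cluster loss needed for a 5% overage
```

Under these assumptions, a 5% overage needs the cluster total to drop by a little under 5% (about 4.8%), which is more than the slight variation described above.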
> We haven't defined whether "disable-preemption" is more important than "max-capacity". IMO, whether we
> should do this JIRA or not is still open to discussion.
I see your point. In other words, it could be argued that the preemption monitor is doing
the right thing. That is, when it sees that the queue is over its absolute max capacity (which
should not happen), the preemption monitor is moving those resources back into the usable

However, the expectation of our users is that if they are running a job on a non-preemptable
queue, their containers should never be preempted. From their point of view, it doesn't matter
what the reason is, they are expecting the RM to obey the contract that says it will not preempt
their resources.
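
As a rough illustration of that expectation, the sketch below shows the rule users have in mind. It is not the actual preemption monitor code: the `Queue` fields and `reclaimable_gb` are hypothetical names, and it only models the over-max-capacity case discussed here.

```python
from dataclasses import dataclass


@dataclass
class Queue:
    name: str
    used_gb: float
    abs_max_cap_gb: float   # absolute max capacity expressed in GB
    preemptable: bool       # False when preemption is disabled for the queue


def reclaimable_gb(queue):
    """GB a monitor would take back if it honored the non-preemptable flag."""
    if not queue.preemptable:
        # The contract users expect: never preempt, whatever the reason.
        return 0.0
    return max(0.0, queue.used_gb - queue.abs_max_cap_gb)


protected = Queue("protected", used_gb=105.0, abs_max_cap_gb=100.0, preemptable=False)
normal = Queue("normal", used_gb=105.0, abs_max_cap_gb=100.0, preemptable=True)

print(reclaimable_gb(protected))  # nothing is reclaimed from the protected queue
print(reclaimable_gb(normal))     # only the amount over the limit is reclaimed
```

In this model, both queues are 5G over their limit, but only the preemptable one gives anything back.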

> Preemption happening on non-preemptable queues
> ----------------------------------------------
>                 Key: YARN-3275
>                 URL: https://issues.apache.org/jira/browse/YARN-3275
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.7.0
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>         Attachments: YARN-3275.v1.txt
> YARN-2056 introduced the ability to turn preemption on and off at the queue level. In
> cases where a queue goes over its absolute max capacity (YARN-3243, for example), containers
> can be preempted from that queue, even though the queue is marked as non-preemptable.
> We are using this feature in large, busy clusters and seeing this behavior.

This message was sent by Atlassian JIRA
