hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4108) CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request
Date Fri, 15 Jan 2016 04:14:40 GMT

    [ https://issues.apache.org/jira/browse/YARN-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15101166#comment-15101166 ]

Wangda Tan commented on YARN-4108:

Thanks for looking at this, [~eepayne].

bq. In the lazy preemption case, PCPP will send an event to the scheduler to mark a container
killable. Can PCPP check if it's already been marked before sending, so that maybe event traffic
will be less in the RM?
Agreed, we can keep a killable map in PCPP, similar to the existing preempted map.
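
A minimal sketch of that idea, not taken from the patch: the policy remembers which containers it has already reported as killable and only emits an event for new ones. {{KillableContainerTracker}} is a hypothetical helper, container IDs are modelled as plain strings, and the event dispatch is a plain callback; all three are illustrative assumptions.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

/** Hypothetical helper: de-duplicates "mark killable" events by container ID. */
final class KillableContainerTracker {
  // IDs of containers already reported to the scheduler as killable.
  private final Set<String> killable = new HashSet<>();
  // Stand-in for the dispatcher that would deliver the event to the scheduler.
  private final Consumer<String> eventSink;

  KillableContainerTracker(Consumer<String> eventSink) {
    this.eventSink = eventSink;
  }

  /** Sends an event only the first time a container is marked; returns true if sent. */
  boolean markKillable(String containerId) {
    if (!killable.add(containerId)) {
      return false; // already marked, skip the redundant event
    }
    eventSink.accept(containerId);
    return true;
  }

  /** Forgets a container, e.g. once it has completed or is no longer a candidate. */
  void unmark(String containerId) {
    killable.remove(containerId);
  }
}
```

Entries would have to be removed when a container finishes or stops being a preemption candidate, otherwise the set grows without bound.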

bq. Currently, if both queueA and queueB are over their guaranteed capacity, preemption will
still occur if queueA is more over capacity than queueB. I think it is probably important
to preserve this behavior (YARN-2592).
Thanks for pointing me to this patch; I had a quick read of the comments on YARN-2592. I think we can still keep
the same behavior in the new proposal: currently I assume only a queue with usage less than its
guaranteed capacity can preempt containers from others, but we can relax this limit to: a queue that doesn't
have to-be-preempted containers can preempt from others.
However, I think allowing two over-satisfied queues to shoot at each other may not be reasonable.
If we have 3 queues configured as a=10, b=20, c=70, then when c uses nothing we cannot simply
interpret a's new capacity as 33 and b's new capacity as 66 (a:b = 10:20). Since the admin only
configured the capacities of a/b as 10/20, we should strictly follow what the admin configured.
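
To make the arithmetic concrete, here is a small sketch of the proportional reinterpretation described above, i.e. the behavior I would rather not adopt. {{CapacityShares.rescaleToActive}} is a hypothetical helper written for illustration only, not scheduler code.

```java
/** Illustration only: rescales configured capacities over the queues that are active. */
final class CapacityShares {
  /**
   * Returns each active queue's share of 100, proportional to its configured
   * capacity; inactive queues get 0.
   */
  static double[] rescaleToActive(double[] configured, boolean[] active) {
    double activeSum = 0.0;
    for (int i = 0; i < configured.length; i++) {
      if (active[i]) {
        activeSum += configured[i];
      }
    }
    double[] shares = new double[configured.length];
    if (activeSum == 0.0) {
      return shares; // nothing active: every share stays 0
    }
    for (int i = 0; i < configured.length; i++) {
      shares[i] = active[i] ? 100.0 * configured[i] / activeSum : 0.0;
    }
    return shares;
  }
}
```

With configured = {10, 20, 70} and only a and b active, this gives roughly 33.3 for a and 66.7 for b, whereas the admin-configured guarantees remain 10 and 20.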

bq. don't see anyplace where ResourceLimits#isAllowPreemption is called. But, if it is, Will
the following code in LeafQueue change preemption behavior?...
Yes, LeafQueue decides whether or not an app can kill containers, and the app uses that decision in {{org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.allocator.RegularContainerAllocator#assignContainer}}
when deciding {{toKillContainers}}.

bq. I'm just trying to understand how things will be affected when headroom for a parent queue
is (limit - used) + killable. Doesn't that say that a parent queue has more headroom than
it's already actually using? Is it relying on this behavior so that the assignment code will
determine that it has more headroom when there are killable containers, and then rely on the
leafqueue to kill those containers?
I'm not sure I understand your question properly; let me try to explain this behavior:

ParentQueue adds its own killable containers to its headroom (getTotalKillableResource is a
bad name; it should be {{getTotalKillableResourceForThisQueue}}). Since these containers
all belong to the parent queue, it has the right to kill any of them to satisfy max-queue-capacity.
Killable containers are actually killed in two cases:
- An under-satisfied leaf queue tries to allocate on a node, but the node doesn't have enough
resources, so it kills containers *on the node* to allocate the new container.
- A queue is using more than its max-capacity and it has killable containers: we will try to
kill containers for such queues to make sure they don't violate max-capacity. You can check the
following code in ParentQueue#allocateResource:
    // check if we need to kill (killable) containers if maximum resource violated.
    if (getQueueCapacities().getAbsoluteMaximumCapacity(nodePartition)
        < getQueueCapacities().getAbsoluteUsedCapacity(nodePartition)) {
      killContainersToEnforceMaxQueueCapacity(nodePartition, clusterResource);
    }
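
As a rough sketch of the two calculations discussed here, with resources reduced to a single number (say, memory in MB): {{QueueHeadroomSketch}} and both method names are hypothetical. Capping the reclaimed amount at the killable total is my assumption about how enforcement would be bounded, not something the patch excerpt above shows.

```java
/** Illustration only: scalar versions of the headroom and max-capacity checks. */
final class QueueHeadroomSketch {
  /** Parent-queue headroom as described above: (limit - used) + killable. */
  static long headroom(long limit, long used, long killable) {
    return (limit - used) + killable;
  }

  /**
   * Amount to reclaim from killable containers so that usage drops back to
   * maxCapacity. Assumption: we can never reclaim more than the killable total.
   */
  static long toKillForMaxCapacity(long maxCapacity, long used, long killable) {
    long over = used - maxCapacity;
    if (over <= 0) {
      return 0; // within max-capacity, nothing needs to be killed
    }
    return Math.min(over, killable);
  }
}
```

For example, with limit = 100, used = 90 and killable = 20, the headroom is 30: the queue may plan allocations against resources that will only become free once the killable containers are actually killed.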

bq. NPE if getChildQueues() returns null
Nice catch, updated locally.

bq. CSAssignment#toKillContainers: I would call them containersToKill
Agreed, updated locally.

bq. It would be interesting to know what your thoughts are on making further modifications
to PCPP to make more informed choices about which containers to kill.
I don't have a clear idea for this yet. A rough idea is that we could add a field
to the scheduler to indicate that some special request (e.g. large, hard-locality, etc.) is starving
and at the head of the line (HOL). PCPP could then scan in the background: after PCPP marks containers to be preempted,
we can use the marked starving-and-HOL requests to adjust the existing set of to-be-preempted containers.
Again, this is rough thinking; I'm not sure if it is doable.

> CapacityScheduler: Improve preemption to preempt only those containers that would satisfy the incoming request
> --------------------------------------------------------------------------------------------------------------
>                 Key: YARN-4108
>                 URL: https://issues.apache.org/jira/browse/YARN-4108
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-4108-design-doc-V3.pdf, YARN-4108-design-doc-v1.pdf, YARN-4108-design-doc-v2.pdf, YARN-4108.poc.1.patch, YARN-4108.poc.2-WIP.patch
> This is sibling JIRA for YARN-2154. We should make sure container preemption is more
> *Requirements:*
> 1) Can handle case of user-limit preemption
> 2) Can handle case of resource placement requirements, such as: hard-locality (I only want to use rack-1) / node-constraints (YARN-3409) / black-list (I don't want to use rack1 and host\[1-3\])
> 3) Can handle preemption within a queue: cross user preemption (YARN-2113), cross application preemption (such as priority-based (YARN-1963) / fairness-based (YARN-3319)).

This message was sent by Atlassian JIRA
