hadoop-yarn-issues mailing list archives

From "Eric Payne (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-3769) Preemption occurring unnecessarily because preemption doesn't consider user limit
Date Tue, 17 Nov 2015 21:36:11 GMT

     [ https://issues.apache.org/jira/browse/YARN-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Eric Payne updated YARN-3769:
    Attachment: YARN-3769-branch-2.7.006.patch

[~leftnoteasy], thanks for your comments.
{quote}
The problem is that getUserResourceLimit is not always updated by the scheduler. If a queue is not
traversed by the scheduler, or the apps of a queue-user have a long heartbeat interval, the user
resource limit could be stale.
{quote}
Got it.
{quote}
I found that the 0005 patch for trunk computes the user limit every time, while the 0005 patch
for 2.7 uses getUserResourceLimit.
{quote}
Yes, I was concerned about using the 2.7 version of {{computeUserLimit}}. It differs
from the branch-2 and trunk versions: it expects a {{required}} parameter which, in 2.7,
is calculated in {{assignContainers}} based on an app's capability requests for a given container
priority. In branch-2 and trunk, it looks like this {{required}} parameter
is just given the value of {{minimumAllocation}}.

So, in {{YARN-3769-branch-2.7.006.patch}} I passed {{minimumAllocation}} in the {{required}}
parameter of {{computeUserLimit}}.
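
To illustrate the idea (this is only a rough sketch, not the code in the patch): the class below models resources as a single memory figure in MB and recomputes a user limit on every call instead of reading a cached value. The class name, the method signature, and the formula are simplified assumptions for illustration and do not match the real {{LeafQueue}} API. The {{required}} argument is simply given the minimum allocation.

```java
// Hypothetical, simplified model of a per-user limit computation.
// Names and signature are illustrative assumptions, not Hadoop's API.
final class UserLimitSketch {

    // Integer division rounding up; assumes a >= 0 and b > 0.
    static long divideAndCeil(long a, long b) {
        return (a + b - 1) / b;
    }

    // Recomputes the limit from current queue state on every call,
    // so it cannot go stale between scheduler passes.
    static long computeUserLimit(long queueCapacityMb,
                                 long queueUsedMb,
                                 long requiredMb,
                                 int activeUsers,
                                 int userLimitPercent,
                                 double userLimitFactor) {
        // If the queue is at or over capacity, size the limit against
        // current usage plus the amount being asked for.
        long currentCapacity = queueUsedMb < queueCapacityMb
                ? queueCapacityMb
                : queueUsedMb + requiredMb;

        // Each active user gets the larger of an equal share and the
        // configured minimum percentage.
        long equalShare = divideAndCeil(currentCapacity, activeUsers);
        long percentShare = divideAndCeil(currentCapacity * userLimitPercent, 100);
        long limit = Math.max(equalShare, percentShare);

        // Never exceed queue capacity scaled by the user limit factor.
        long cap = (long) (queueCapacityMb * userLimitFactor);
        return Math.min(limit, cap);
    }

    public static void main(String[] args) {
        long minimumAllocationMb = 1024; // assumed value for the example
        // Pass the minimum allocation as the "required" amount.
        long limit = computeUserLimit(10240, 12288, minimumAllocationMb, 2, 50, 1.0);
        System.out.println("user limit (MB): " + limit);
    }
}
```

In this toy model, a preemption policy would call {{computeUserLimit}} each time it evaluates a queue, so the value reflects the queue's current usage even if the scheduler has not visited that queue recently.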

> Preemption occurring unnecessarily because preemption doesn't consider user limit
> ---------------------------------------------------------------------------------
>                 Key: YARN-3769
>                 URL: https://issues.apache.org/jira/browse/YARN-3769
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>         Attachments: YARN-3769-branch-2.002.patch, YARN-3769-branch-2.7.002.patch, YARN-3769-branch-2.7.003.patch,
>                      YARN-3769-branch-2.7.005.patch, YARN-3769-branch-2.7.006.patch, YARN-3769.001.branch-2.7.patch,
>                      YARN-3769.001.branch-2.8.patch, YARN-3769.003.patch, YARN-3769.004.patch, YARN-3769.005.patch
>
> We are seeing the preemption monitor preempting containers from queue A and then seeing
> the capacity scheduler giving them immediately back to queue A. This happens quite often and
> causes a lot of churn.

This message was sent by Atlassian JIRA