hadoop-yarn-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4105) Capacity Scheduler headroom for DRF is wrong
Date Fri, 04 Sep 2015 15:49:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14730978#comment-14730978 ]

Hudson commented on YARN-4105:
------------------------------

FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #352 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/352/])
YARN-4105. Capacity Scheduler headroom for DRF is wrong. Contributed by Chang Li (jlowe: rev 6eaca2e3634a88dc55689e8960352d6248c424d9)
* hadoop-yarn-project/CHANGES.txt
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/TestCapacityScheduler.java
* hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/LeafQueue.java


> Capacity Scheduler headroom for DRF is wrong
> --------------------------------------------
>
>                 Key: YARN-4105
>                 URL: https://issues.apache.org/jira/browse/YARN-4105
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.6.0
>            Reporter: Chang Li
>            Assignee: Chang Li
>             Fix For: 2.7.2
>
>         Attachments: YARN-4105.2.patch, YARN-4105.3.patch, YARN-4105.4.patch, YARN-4105.patch
>
>
> Related to the problem discussed in YARN-1857, but the min method is flawed when we are
using DRC. We ran into a real scenario in production where queueCapacity: <memory:1056256,
vCores:3750>, qconsumed: <memory:1054720, vCores:361>, consumed: <memory:125952,
vCores:170>, limit: <memory:214016, vCores:755>. The headroom calculation returns 88064
(limit minus consumed) when only 1536 is left in the queue, because DRC effectively compares
by vcores. This caused a deadlock: the RMContainerAllocator thought there was still space for
a mapper and would not preempt a reducer in a full queue to schedule a mapper. Proposed fix:
use componentwiseMin.
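
The arithmetic in the description can be reproduced with a minimal sketch. This is not the LeafQueue code from the patch: the class Res and the helpers dominantShare, dominantMin and componentwiseMin below are hypothetical stand-ins written for illustration, and the cluster total is assumed to equal queueCapacity because the report does not give the cluster size.

```java
public class HeadroomSketch {
    /** Hypothetical stand-in for a YARN resource: memory plus vcores. */
    static final class Res {
        final long memory;
        final long vcores;

        Res(long memory, long vcores) {
            this.memory = memory;
            this.vcores = vcores;
        }

        Res minus(Res other) {
            return new Res(memory - other.memory, vcores - other.vcores);
        }

        @Override
        public String toString() {
            return "<memory:" + memory + ", vCores:" + vcores + ">";
        }
    }

    /** Largest fraction of the (assumed) cluster total used by any one resource. */
    static double dominantShare(Res r, Res cluster) {
        return Math.max((double) r.memory / cluster.memory,
                        (double) r.vcores / cluster.vcores);
    }

    /** Picks one whole operand: the one with the smaller dominant share. */
    static Res dominantMin(Res a, Res b, Res cluster) {
        return dominantShare(a, cluster) <= dominantShare(b, cluster) ? a : b;
    }

    /** Takes the minimum of each resource independently. */
    static Res componentwiseMin(Res a, Res b) {
        return new Res(Math.min(a.memory, b.memory), Math.min(a.vcores, b.vcores));
    }

    public static void main(String[] args) {
        // Values quoted in the issue description.
        Res queueCapacity = new Res(1056256, 3750);
        Res qconsumed = new Res(1054720, 361);
        Res consumed = new Res(125952, 170);
        Res limit = new Res(214016, 755);

        // Assumption: cluster total equals queueCapacity (not stated in the report).
        Res cluster = queueCapacity;

        Res userLeft = limit.minus(consumed);            // <memory:88064, vCores:585>
        Res queueLeft = queueCapacity.minus(qconsumed);  // <memory:1536, vCores:3389>

        // Dominant-share min returns userLeft whole, so memory headroom is overstated.
        System.out.println("dominant min:      " + dominantMin(userLeft, queueLeft, cluster));
        // Component-wise min caps memory at what the queue actually has left.
        System.out.println("componentwise min: " + componentwiseMin(userLeft, queueLeft));
    }
}
```

Under these assumptions the dominant-share minimum reports 88064 of memory headroom because the vcores term dominates both operands, while the component-wise minimum reports 1536, the amount of memory actually left in the queue.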




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
