hadoop-yarn-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-5077) Fix FSLeafQueue#getFairShare() for queues with weight 0.0
Date Tue, 17 May 2016 22:57:13 GMT

    [ https://issues.apache.org/jira/browse/YARN-5077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15287817#comment-15287817 ]

Karthik Kambatla commented on YARN-5077:

This has been a long-standing inconvenience. Thanks for working on this. 

High-level comment: IIUC, we want to consider queues with zero weight only when computing
the instantaneous fair share. And, IIRC, only active apps are passed to compute-instantaneous-shares,
so we probably don't have to check whether a queue is active. That said, when computing instantaneous
fair shares, we could check whether any of the queues has a non-zero weight. 

Other minor comments on the patch:
# Instead of double negation in the variable name, can we pass {{forceWeightToOne}} to {{ComputeShares#computeShare}}
and {{allWeightsZero}} to {{resourceUsedWithWeightToResourceRatio}}?
# The method to check weights itself could be {{areAllWeightsZero}}
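The suggested helper could look roughly like this. This is only a sketch: the `Schedulable` interface below is a simplified stand-in for the real FairScheduler class, and the surrounding `ComputeFairShares` code is elided.

```java
import java.util.Collection;

// Simplified stand-in for the FairScheduler's Schedulable interface.
interface Schedulable {
  double getWeight();
}

final class ComputeFairShares {
  // Suggested name from the review: returns true when every schedulable
  // has weight 0.0, in which case the instantaneous fair-share
  // computation could fall back to treating each weight as 1.0.
  static boolean areAllWeightsZero(Collection<? extends Schedulable> schedulables) {
    for (Schedulable s : schedulables) {
      if (s.getWeight() != 0.0) {
        return false;
      }
    }
    return true;
  }
}
```

With a helper named this way, the call site can pass the boolean down (e.g. as the proposed {{forceWeightToOne}}) instead of a double-negated flag.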

The test is pretty neat. I cringe every time I see the xml form of the FairScheduler allocations
file in tests, but we already have many of them. Filed YARN-5016 for that. 

> Fix FSLeafQueue#getFairShare() for queues with weight 0.0
> ---------------------------------------------------------
>                 Key: YARN-5077
>                 URL: https://issues.apache.org/jira/browse/YARN-5077
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Yufei Gu
>            Assignee: Yufei Gu
>         Attachments: YARN-5077.001.patch, YARN-5077.002.patch
> 1) When a queue's weight is set to 0.0, FSLeafQueue#getFairShare() returns <memory:0,
> 2) When a queue's weight is nonzero, FSLeafQueue#getFairShare() returns <memory:16384,
> In case 1), that means no container ever gets allocated for an AM because, from the viewpoint
> of the RM, there is never any headroom to allocate a container on that queue.
> For example, we have a pool with the following weights: 
> - root.dev 0.0 
> - root.product 1.0
> root.dev is a best-effort pool and should only get resources if root.product is not
> running. In our tests, with no jobs running under root.product, jobs started in the root.dev
> queue stay stuck in the ACCEPTED state and never start.

This message was sent by Atlassian JIRA
