hadoop-yarn-issues mailing list archives

From "Craig Welch (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2008) CapacityScheduler may report incorrect queueMaxCap if there is hierarchy queue structure
Date Wed, 16 Jul 2014 02:42:04 GMT

    [ https://issues.apache.org/jira/browse/YARN-2008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063021#comment-14063021 ]

Craig Welch commented on YARN-2008:
-----------------------------------

To calculate the "actual available" percentage, it seems necessary to walk up the tree from
the queue in question. At each level, take the min of the queue's max allowed % and
(100% - the actual used percentage of all its siblings, i.e. the other queues under the same
parent, never the ancestor itself), and multiply it into the running total so far.

In this example, starting with L2Q2 at the lowest level of the tree: one sibling (L2Q1) is
using 50%, so the min of 100% (L2Q2 didn't have a max, so assuming 100%) and 100% - 50% (the
sum of all other queues, in this case just L2Q1) is 50%. Going up the next step, L1Q1 is the
ancestor, and "all others" is just L1Q2, which is using 40%, so at the L1 level it is
min(80% max, 100% - 40% = 60%), which is 60%. 60% * 50% == 30%, which is what you had; this
is just the nuts and bolts of how to do it moving up the tree.  [~airbots] Sound reasonable?
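
The walk described above can be sketched roughly as follows. This is only an illustration, not CapacityScheduler code: QueueNode, HeadroomSketch and availableFraction are hypothetical names, all fractions are assumed to be expressed relative to the parent queue, a queue with no configured max is assumed to be 100%, and clamping the free share at 0 is an added assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical queue model for illustration; not the CapacityScheduler's own classes.
final class QueueNode {
    final String name;
    final double maxFraction;   // max share of the parent this queue may use (1.0 = no max configured)
    final double usedFraction;  // share of the parent this queue is currently using
    final QueueNode parent;     // null for the root queue
    final List<QueueNode> children = new ArrayList<>();

    QueueNode(String name, double maxFraction, double usedFraction, QueueNode parent) {
        this.name = name;
        this.maxFraction = maxFraction;
        this.usedFraction = usedFraction;
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }
}

final class HeadroomSketch {
    // Walks from the given queue up to the root. At each level it takes
    // min(max allowed, 100% - usage of all siblings) and multiplies it
    // into the running total.
    static double availableFraction(QueueNode queue) {
        double running = 1.0;
        for (QueueNode node = queue; node.parent != null; node = node.parent) {
            double usedBySiblings = 0.0;
            for (QueueNode sibling : node.parent.children) {
                if (sibling != node) {
                    usedBySiblings += sibling.usedFraction;
                }
            }
            // Assumption: an over-committed level leaves nothing rather than a negative share.
            double freeAtLevel = Math.max(0.0, 1.0 - usedBySiblings);
            running *= Math.min(node.maxFraction, freeAtLevel);
        }
        return running;
    }

    public static void main(String[] args) {
        // The example from this issue. L1Q1's own usage is never consulted on the
        // walk from L2Q2 (it is an ancestor), so it is left at 0 here.
        QueueNode root = new QueueNode("root", 1.0, 0.0, null);
        QueueNode l1q1 = new QueueNode("L1Q1", 0.8, 0.0, root);
        QueueNode l1q2 = new QueueNode("L1Q2", 1.0, 0.4, root);  // max assumed 100%
        QueueNode l2q1 = new QueueNode("L2Q1", 1.0, 0.5, l1q1);
        QueueNode l2q2 = new QueueNode("L2Q2", 1.0, 0.0, l1q1);
        System.out.printf("%s can use %.2f of the root%n", l2q2.name, availableFraction(l2q2));
    }
}
```

For L2Q2 the loop computes min(100%, 50%) at the leaf level and min(80%, 60%) at the L1 level, giving the 60% * 50% = 30% worked out above.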

> CapacityScheduler may report incorrect queueMaxCap if there is hierarchy queue structure
> -----------------------------------------------------------------------------------------
>
>                 Key: YARN-2008
>                 URL: https://issues.apache.org/jira/browse/YARN-2008
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>    Affects Versions: 2.3.0
>            Reporter: Chen He
>            Assignee: Chen He
>
> If there are two queues, both allowed to use 100% of the actual resources in the cluster,
> and Q1 and Q2 each currently use 50% of the cluster's resources, there is no space actually
> available. With the current method of getting headroom, the CapacityScheduler thinks there
> are still resources available for users in Q1, but they have already been used by Q2.
> If the CapacityScheduler has a hierarchical queue structure, it may report an incorrect
> queueMaxCap. Here is an example:
>
>                             rootQueue
>                            /         \
>            L1ParentQueue1               L1ParentQueue2
>     (allowed to use up to 80%       (allowed to use a minimum of
>           of its parent)                20% of its parent)
>            /            \
>    L2LeafQueue1        L2LeafQueue2
>  (50% of its parent)  (a minimum of 50% of its parent)
>
> When we calculate the headroom of a user in L2LeafQueue2, the current method will think
> L2LeafQueue2 can use 40% (80%*50%) of the actual rootQueue resources. However, without
> checking L1ParentQueue1, we cannot be sure. It is possible that L1ParentQueue2 has used 40%
> of the rootQueue resources right now. In that case, L2LeafQueue2 can actually only use
> 30% (60%*50%).



--
This message was sent by Atlassian JIRA
(v6.2#6252)
