hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3243) CapacityScheduler should pass headroom from parent to children to make sure ParentQueue obey its capacity limits.
Date Wed, 11 Mar 2015 00:07:39 GMT

    [ https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355995#comment-14355995 ]

Jian He commented on YARN-3243:
-------------------------------

Looks good overall. Some comments:
- {{AbstractCSQueue#getCurrentLimitResource}}
-- Add comments explaining how currentLimitResource is calculated.
- {{getResourceLimitsOfChild}}, suggested renames:
-- myLimits -> parentLimits
-- myMaxAvailableResource -> parentMaxAvailableResource
-- childMaxResource -> childConfiguredMaxResource

- setHeadroomInfo -> setQueueResourceLimitsInfo
- The needExtraNewOrReservedContainer flag could use a better name, perhaps shouldAllocOrReserveNewContainer?
- Similarly for the needExtraNewOrReservedContainer method.
- Revert the TestContainerAllocation change.
- In {{ 1GB (am) + 5GB * 2 = 9GB }}, 5GB should be 4GB.
- Do you think passing down a QueueHeadRoom, rather than a QueueMaxLimit, might make the code
simpler?
- checkLimitsToReserve may not need to be invoked if we are assigning a reserved container:
{code}
if (reservationsContinueLooking) {
//          // we got here by possibly ignoring parent queue capacity limits. If
//          // the parameter needToUnreserve is true it means we ignored one of
//          // those limits in the chance we could unreserve. If we are here
//          // we aren't trying to unreserve so we can't allocate
//          // anymore due to that parent limit
//          boolean res = checkLimitsToReserve(clusterResource,
{code}

> CapacityScheduler should pass headroom from parent to children to make sure ParentQueue
obey its capacity limits.
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3243
>                 URL: https://issues.apache.org/jira/browse/YARN-3243
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler, resourcemanager
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch
>
>
> Currently, CapacityScheduler cannot guarantee that a ParentQueue always obeys its capacity
limits. For example:
> 1) When allocating a container under a parent queue, it only checks parentQueue.usage
< parentQueue.max. If a leaf queue allocates a container with size > (parentQueue.max - parentQueue.usage),
the parent queue can exceed its max resource limit, as in the following example:
> {code}
>         A  (usage=54, max=55)
>        /     \
>       A1     A2 (usage=1, max=55)
> (usage=53, max=53)
> {code}
> Queue-A2 is able to allocate a container since its usage < max, but if it does, A's
usage can exceed A.max.
> 2) When doing the continuous reservation check, the parent queue only tells its children "you
need to unreserve *some* resource, so that I stay below my maximum resource", but it does
not say how much resource needs to be unreserved. This may also lead to the parent queue exceeding its
configured maximum capacity.
> With YARN-3099/YARN-3124, we now have a {{ResourceUsage}} class in each queue. *Here is
my proposal*:
> - ParentQueue will set its children's ResourceUsage.headroom, which means the *maximum resource
its children can allocate*.
> - ParentQueue will set its children's headroom to (say the parent's name is "qA"):
min(qA.headroom, qA.max - qA.used). This makes sure the capacity limits of qA's ancestors are enforced
as well (qA.headroom is set by qA's parent).
> - {{needToUnReserve}} is no longer necessary; instead, children can work out how much resource needs
to be unreserved to stay within their parent's resource limit.
> - Moreover, with this, YARN-3026 will draw a clear boundary between LeafQueue and FiCaSchedulerApp;
headroom will take the user limit, etc. into account.
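
The headroom rule in the proposal above, min(qA.headroom, qA.max - qA.used), can be sketched in a few lines of Java using the A / A2 numbers from the example. This is only an illustration under simplifying assumptions: resources are plain integers rather than YARN Resource objects, and the class HeadroomSketch, its nested Queue type, and the methods headroomForChildren and canAllocate are hypothetical names that do not exist in CapacityScheduler. Clamping the result at zero is also an assumption, not something stated in the issue.

```java
// Illustrative sketch only; none of these names are CapacityScheduler APIs.
class HeadroomSketch {

    /** Hypothetical stand-in for a queue, with resources as plain integers. */
    static final class Queue {
        final String name;
        final long used;
        final long max;

        Queue(String name, long used, long max) {
            this.name = name;
            this.used = used;
            this.max = max;
        }
    }

    /**
     * Headroom a parent hands to its children:
     * min(parent.headroom, parent.max - parent.used).
     * Clamped at zero (an assumption) so an over-limit parent yields no headroom.
     */
    static long headroomForChildren(long parentHeadroom, Queue parent) {
        long remaining = parent.max - parent.used;
        return Math.max(0L, Math.min(parentHeadroom, remaining));
    }

    /**
     * A child may allocate only if the container fits both its own
     * remaining capacity and the headroom passed down by its parent.
     */
    static boolean canAllocate(Queue child, long headroomFromParent, long containerSize) {
        long ownRemaining = child.max - child.used;
        return containerSize <= Math.min(ownRemaining, headroomFromParent);
    }

    public static void main(String[] args) {
        Queue a = new Queue("A", 54, 55);
        Queue a2 = new Queue("A2", 1, 55);

        // Assumption: A's own parent imposes no tighter limit than A.max.
        long headroomFromAbove = Long.MAX_VALUE;
        long headroom = headroomForChildren(headroomFromAbove, a);

        System.out.println("headroom passed to A's children: " + headroom);     // 1
        System.out.println("A2 can allocate 2: " + canAllocate(a2, headroom, 2)); // false
        System.out.println("A2 can allocate 1: " + canAllocate(a2, headroom, 1)); // true
    }
}
```

With only the usage < max check on A2, the size-2 container would be accepted and A would reach 56 against a max of 55. With the headroom passed down, it is rejected.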



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
