hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1198) Capacity Scheduler headroom calculation does not work as expected
Date Mon, 25 Aug 2014 10:23:58 GMT

    [ https://issues.apache.org/jira/browse/YARN-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14108991#comment-14108991 ]

Wangda Tan commented on YARN-1198:
----------------------------------

Hi [~cwelch],
Thanks for updating; I just went through your patch.

I think the current approach makes more sense to me than patch#4, since it avoids iterating
over all apps when computing headroom. However, CapacityHeadroomProvider#getHeadroom will
still recompute the headroom on every application heartbeat. Assuming #applications >> #users
in a queue (the most common case), that is still a little costly.

I prefer the method mentioned by Jason: specifically, we can keep a map of
<user, headroom> for each queue, and whenever the headroom needs to be updated, we update all
the entries in that map. Each SchedulerApplicationAttempt would hold a reference to its user's headroom.
The "headroom" in the map could be much the same as the {{HeadroomProvider}} in your patch, but I would
suggest renaming {{HeadroomProvider}} to {{HeadroomReference}}, because it no longer needs to
do any computation.

Another benefit is that we don't need to write a HeadroomProvider for each scheduler; a simple HeadroomReference
with a getter/setter should be enough.
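
Just to illustrate what I mean, a minimal sketch (class and field names are only for illustration, not from the patch):
{code}
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.util.resource.Resources;

// A plain, scheduler-agnostic holder for the per-user headroom: the queue
// writes it, every SchedulerApplicationAttempt of that user only reads it,
// and no computation happens here.
public class HeadroomReference {
  private volatile Resource headroom = Resources.none();

  public Resource getHeadroom() {
    return headroom;
  }

  public void setHeadroom(Resource headroom) {
    this.headroom = headroom;
  }
}
{code}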

Two more things we should take care of with this method:
1) As Jason mentioned, both the fair and capacity schedulers currently support moving an app between
queues, so we should recompute the headroom and switch the app's reference after the move completes.
2) In LeafQueue#assignContainers, we don't need to call
{code}
  Resource userLimit = 
      computeUserLimitAndSetHeadroom(application, clusterResource, 
          required);
{code}
for each application; iterating over and updating the map of <user, headroom> in
LeafQueue#updateClusterResource should be enough.
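
For example (sketch only; the field and helper names below are illustrative, and computeHeadroomForUser stands in for the existing user-limit calculation in LeafQueue):
{code}
// One HeadroomReference per user, kept by the LeafQueue and shared with all
// of that user's SchedulerApplicationAttempts.
private final Map<String, HeadroomReference> userToHeadroom =
    new ConcurrentHashMap<String, HeadroomReference>();

// Refresh every user's headroom in a single pass, e.g. when the cluster
// resource or the queue configuration changes.
private synchronized void updateUserHeadrooms(Resource clusterResource) {
  for (Map.Entry<String, HeadroomReference> entry : userToHeadroom.entrySet()) {
    entry.getValue().setHeadroom(
        computeHeadroomForUser(entry.getKey(), clusterResource));
  }
}
{code}
This way the per-heartbeat path only reads the reference, and all recomputation is pushed to the much less frequent cluster/queue update path.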

Wangda


> Capacity Scheduler headroom calculation does not work as expected
> -----------------------------------------------------------------
>
>                 Key: YARN-1198
>                 URL: https://issues.apache.org/jira/browse/YARN-1198
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Omkar Vinit Joshi
>            Assignee: Craig Welch
>         Attachments: YARN-1198.1.patch, YARN-1198.2.patch, YARN-1198.3.patch, YARN-1198.4.patch, YARN-1198.5.patch, YARN-1198.6.patch, YARN-1198.7.patch
>
>
> Today, headroom calculation (for the app) takes place only when:
> * A new node is added to or removed from the cluster
> * A new container is assigned to the application.
> However, there are potentially a lot of situations which are not considered in this calculation:
> * If a container finishes, then the headroom for that application changes and the AM should be notified accordingly.
> * If a single user has submitted multiple applications (app1 and app2) to the same queue, then:
> ** If app1's container finishes, then not only app1's but also app2's AM should be notified about the change in headroom.
> ** Similarly, if a container is assigned to either application app1/app2, then both AMs should be notified about their headroom.
> ** To simplify the whole communication process, it is ideal to keep headroom per user per LeafQueue so that everyone gets the same picture (apps belonging to the same user and submitted to the same queue).
> * If a new user submits an application to the queue, then all applications submitted by all users in that queue should be notified of the headroom change.
> * Also, today headroom is an absolute number (I think it should be normalized, but that would not be backward compatible..).
> * Also, when an admin user refreshes the queue, the headroom has to be updated.
> These are all potential bugs in the headroom calculation.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
