hadoop-yarn-issues mailing list archives

From "Wangda Tan (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1198) Capacity Scheduler headroom calculation does not work as expected
Date Mon, 22 Sep 2014 15:21:35 GMT

    [ https://issues.apache.org/jira/browse/YARN-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14143301#comment-14143301

Wangda Tan commented on YARN-1198:

Hi [~cwelch],
Sorry for the late response. I've just looked at your ver.8 patch and comments.
My reply:
bq. -re "we don't need write HeadroomProvider for each scheduler" 
bq. Provider vs Reference
I agree with this. I think we need to write different Headroom Providers, and it's better to keep
Provider since it's more general.

bq. -re "As mentioned by Jason, currently ...
Agree, this can be done in a separate JIRA.

bq. -re the cost of the calculation
Agree, it's just a small computation effort.

Previously, I suggested doing it as I described in https://issues.apache.org/jira/browse/YARN-1198?focusedCommentId=14108991&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14108991
because I thought that would make the code cleaner.
But after looking at your ver.8 patch, I realized that may not be doable. LeafQueue#computeUserLimit
uses required to compute the user limit. In your patch, you save lastRequired in the User class.
However, we need a different required for different apps under the same user. We can only do the
calculation when an app heartbeats. (We could also loop over all apps and set each app's headroom,
but that's an approach we abandoned before.)
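
To illustrate the point about required, here is a minimal Java sketch. It is not the actual LeafQueue code: QueueState, computeUserLimit as written here, and headroomProvider are hypothetical, heavily simplified stand-ins, and the limit formula is only an assumed approximation. It shows that two apps of the same user with different required values get different headroom, and that a provider evaluated at heartbeat time picks up the current queue state.

```java
import java.util.function.IntSupplier;

class HeadroomSketch {
    /** Hypothetical snapshot of queue/user usage; fields are illustrative only. */
    static final class QueueState {
        int capacity;     // queue capacity (e.g. MB)
        int used;         // resources used by the whole queue
        int activeUsers;  // users with pending requests
        int userUsed;     // resources used by this user across all of their apps

        QueueState(int capacity, int used, int activeUsers, int userUsed) {
            this.capacity = capacity;
            this.used = used;
            this.activeUsers = activeUsers;
            this.userUsed = userUsed;
        }
    }

    /** Assumed, simplified user limit: it depends on the app-specific `required`. */
    static int computeUserLimit(QueueState q, int required) {
        int currentCapacity = q.used < q.capacity ? q.capacity : q.used + required;
        return currentCapacity / Math.max(q.activeUsers, 1);
    }

    /** Per-app provider: headroom is computed lazily, when the app heartbeats. */
    static IntSupplier headroomProvider(QueueState q, int required) {
        return () -> Math.max(0, computeUserLimit(q, required) - q.userUsed);
    }

    public static void main(String[] args) {
        QueueState q = new QueueState(100, 100, 2, 40);
        IntSupplier app1 = headroomProvider(q, 2);  // same user, required = 2
        IntSupplier app2 = headroomProvider(q, 8);  // same user, required = 8

        // "Heartbeat" 1: same user and queue, different headroom per app.
        System.out.println(app1.getAsInt() + " " + app2.getAsInt());

        // Queue state changes; no per-app update loop is needed.
        q.used = 110;
        q.userUsed = 50;

        // "Heartbeat" 2: each provider reflects the new state on demand.
        System.out.println(app1.getAsInt() + " " + app2.getAsInt());
    }
}
```

Under these assumptions the first pair is 11 and 14 and the second is 6 and 9. A single lastRequired stored per user could not produce both values in a pair, which is the concern raised above.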

So basically, IMHO, your ver.7 is the more correct way to go. Which keeps complexity/efficiency
Any thoughts? [~jianhe], [~cwelch].


> Capacity Scheduler headroom calculation does not work as expected
> -----------------------------------------------------------------
>                 Key: YARN-1198
>                 URL: https://issues.apache.org/jira/browse/YARN-1198
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Omkar Vinit Joshi
>            Assignee: Craig Welch
>         Attachments: YARN-1198.1.patch, YARN-1198.2.patch, YARN-1198.3.patch, YARN-1198.4.patch, YARN-1198.5.patch, YARN-1198.6.patch, YARN-1198.7.patch, YARN-1198.8.patch
> Today headroom calculation (for the app) takes place only when
> * New node is added/removed from the cluster
> * New container is getting assigned to the application.
> However, there are potentially a lot of situations which are not considered for this calculation
> * If a container finishes then headroom for that application will change and should be notified to the AM accordingly.
> * If a single user has submitted multiple applications (app1 and app2) to the same queue
> ** If app1's container finishes then not only app1's but also app2's AM should be notified about the change in headroom.
> ** Similarly if a container is assigned to any of the applications app1/app2 then both AMs should be notified about their headroom.
> ** To simplify the whole communication process it is ideal to keep headroom per User per LeafQueue so that everyone gets the same picture (apps belonging to same user and submitted in same queue).
> * If a new user submits an application to the queue then all applications submitted by all users in that queue should be notified of the headroom change.
> * Also today headroom is an absolute number (I think it should be normalized but then this is going to be not backward compatible..)
> * Also when admin user refreshes queue, headroom has to be updated.
> These are all potential bugs in headroom calculations
