hadoop-yarn-issues mailing list archives

From "Craig Welch (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1198) Capacity Scheduler headroom calculation does not work as expected
Date Wed, 13 Aug 2014 01:12:13 GMT

    [ https://issues.apache.org/jira/browse/YARN-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14094997#comment-14094997 ]

Craig Welch commented on YARN-1198:
-----------------------------------

So, looking at this a bit more holistically - it appears to me that the cumulative effect
of the changes in this jira and its subtasks is that any change in utilization by any application
in the queue potentially affects the headroom of all of the applications in the queue (really,
any change anywhere in the cluster when you consider [YARN-2008], but putting that aside for
the moment...). The current approach (.4 patch) may do the trick, but I wonder if it wouldn't
be better to tweak things a bit in the following way:

given that:
an application's headroom is effectively a user's headroom for the application's queue (the
user in queue headroom)
and
the user in queue headroom is effectively a generic per user headroom in the queue (an identical
slicing for all users based on how many are active combined with the user limit factor) minus
what that user is already using across all applications (already tracked in User)
and
any change which impacts this causes a headroom recalculation for one application in the
queue, but may affect all of them

when recalculating headroom on any event, we could generate one generic queue-user value and
then iterate over all the applications in the queue, setting each one's headroom to a per-user
value, which would simply be the generic queue-per-user headroom minus that user's used resources.

Which is to say, I think that any time we recalculate the headroom we want to recalculate
it for all users in the queue and apply the change to all applications in the queue - and
I believe the simplest and most efficient way to do that would be to generate a generic "queue
headroom", apply the generic per-user logic, then iterate over the applications and set each
application user's headroom (the same for all of that user's applications - calculated once per
user - the generic value minus that user's used resources).
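
To make the proposal above concrete, here is a rough sketch of the recalculation pass. It is
not taken from any of the attached patches: the class, field and method names are hypothetical,
resources are reduced to a single memory figure in MB rather than YARN's Resource type, the
generic per-user value is assumed to have been computed already by the existing user limit
logic, and clamping the result at zero is my own assumption.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: none of these names exist in the CapacityScheduler.
final class HeadroomSketch {

    // Stand-in for an application in the leaf queue.
    static final class App {
        final String id;
        final String user;
        long headroomMb; // the value that would be reported to the AM

        App(String id, String user) {
            this.id = id;
            this.user = user;
        }
    }

    /**
     * Recalculates headroom for every application in one queue.
     *
     * genericPerUserMb: the generic queue-per-user value, computed once per event
     *                   (assumed to come from the existing user limit logic).
     * usedByUserMb:     what each user already uses across all of their applications.
     */
    static void recalculateAll(long genericPerUserMb,
                               Map<String, Long> usedByUserMb,
                               List<App> apps) {
        Map<String, Long> headroomByUser = new HashMap<>();
        for (App app : apps) {
            // Computed once per user, then shared by all of that user's applications.
            long headroom = headroomByUser.computeIfAbsent(
                app.user,
                u -> Math.max(0L, genericPerUserMb - usedByUserMb.getOrDefault(u, 0L)));
            app.headroomMb = headroom;
        }
    }
}
```

The point of the per-user map is that the subtraction happens once per user rather than once
per application, so two applications from the same user always see the same headroom.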

> Capacity Scheduler headroom calculation does not work as expected
> -----------------------------------------------------------------
>
>                 Key: YARN-1198
>                 URL: https://issues.apache.org/jira/browse/YARN-1198
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Omkar Vinit Joshi
>            Assignee: Craig Welch
>         Attachments: YARN-1198.1.patch, YARN-1198.2.patch, YARN-1198.3.patch, YARN-1198.4.patch
>
>
> Today headroom calculation (for the app) takes place only when
> * New node is added/removed from the cluster
> * New container is getting assigned to the application.
> However, there are potentially a lot of situations which are not considered for this calculation:
> * If a container finishes then headroom for that application will change and the AM should be notified accordingly.
> * If a single user has submitted multiple applications (app1 and app2) to the same queue then
> ** If app1's container finishes then not only app1's but also app2's AM should be notified about the change in headroom.
> ** Similarly, if a container is assigned to either application (app1/app2) then both AMs should be notified about their headroom.
> ** To simplify the whole communication process it is ideal to keep headroom per User per LeafQueue so that everyone gets the same picture (apps belonging to the same user and submitted in the same queue).
> * If a new user submits an application to the queue then all applications submitted by all users in that queue should be notified of the headroom change.
> * Also, today headroom is an absolute number (I think it should be normalized, but then this is not going to be backward compatible).
> * Also, when an admin user refreshes the queue, headroom has to be updated.
> These are all potential bugs in the headroom calculation.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
