hadoop-yarn-issues mailing list archives

From "Sandy Ryza (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps
Date Wed, 18 Jun 2014 18:46:25 GMT

    [ https://issues.apache.org/jira/browse/YARN-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036130#comment-14036130 ]

Sandy Ryza commented on YARN-2176:
----------------------------------

Without the ActivationCallback, the ActiveUsersManager would need to call into the leaf queue,
which it currently doesn't even have a reference to.  An edge from the ActiveUsersManager to
the leaf queue seems stranger to me than an edge from the AppSchedulingInfo to the leaf queue,
because tracing what's going on would require more hops.  What do you think about one of the
following?
* Have both the ActiveUsersManager and the leaf queue register for the callback.
* Have only the leaf queue register for the callback, and make it responsible for notifying
the ActiveUsersManager (which it already has a reference to).
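To make the second option concrete, here is a minimal sketch. It is not the code from the patch: the callback's method names, the String ids, and the pending-request counter are all assumptions made for illustration, and the classes are simplified stand-ins that only borrow the names used in this thread. The AppSchedulingInfo holds a single edge (the callback), the leaf queue is the only registrant, and the leaf queue forwards the event to the ActiveUsersManager it already references.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class ActivationSketch {

    /** Hypothetical callback shape; the real ActivationCallback may differ. */
    interface ActivationCallback {
        void activated(String appId, String user);
        void deactivated(String appId, String user);
    }

    /** Simplified stand-in: counts active applications per user. */
    static class ActiveUsersManager {
        private final Map<String, Integer> activeAppsPerUser = new HashMap<>();

        void activateApplication(String user) {
            activeAppsPerUser.merge(user, 1, Integer::sum);
        }

        void deactivateApplication(String user) {
            // Drop the user once their last active application goes idle.
            activeAppsPerUser.computeIfPresent(user, (u, n) -> n > 1 ? n - 1 : null);
        }

        int getNumActiveUsers() {
            return activeAppsPerUser.size();
        }
    }

    /** Simplified stand-in: the only registrant for the callback. */
    static class LeafQueue implements ActivationCallback {
        private final ActiveUsersManager activeUsersManager = new ActiveUsersManager();
        private final Set<String> activeApps = new LinkedHashSet<>();

        @Override
        public void activated(String appId, String user) {
            // Forward to the ActiveUsersManager only on a real state change.
            if (activeApps.add(appId)) {
                activeUsersManager.activateApplication(user);
            }
        }

        @Override
        public void deactivated(String appId, String user) {
            if (activeApps.remove(appId)) {
                activeUsersManager.deactivateApplication(user);
            }
        }

        Set<String> getActiveApps() {
            return activeApps;
        }

        int getNumActiveUsers() {
            return activeUsersManager.getNumActiveUsers();
        }
    }

    /** Simplified stand-in: fires the callback when pending requests cross zero. */
    static class AppSchedulingInfo {
        private final String appId;
        private final String user;
        private final ActivationCallback callback;
        private int pendingRequests;

        AppSchedulingInfo(String appId, String user, ActivationCallback callback) {
            this.appId = appId;
            this.user = user;
            this.callback = callback;
        }

        void setPendingRequests(int newPending) {
            boolean wasActive = pendingRequests > 0;
            pendingRequests = newPending;
            boolean isActive = pendingRequests > 0;
            if (!wasActive && isActive) {
                callback.activated(appId, user);
            } else if (wasActive && !isActive) {
                callback.deactivated(appId, user);
            }
        }
    }

    public static void main(String[] args) {
        LeafQueue queue = new LeafQueue();
        AppSchedulingInfo app1 = new AppSchedulingInfo("app_1", "alice", queue);
        AppSchedulingInfo app2 = new AppSchedulingInfo("app_2", "alice", queue);
        app1.setPendingRequests(3);
        app2.setPendingRequests(1);
        app1.setPendingRequests(0); // app_1 goes idle; alice stays active via app_2
        System.out.println(queue.getActiveApps() + " users=" + queue.getNumActiveUsers());
    }
}
```

In this shape the ActiveUsersManager never needs a reference to the leaf queue, and every activation can be traced through one path: AppSchedulingInfo, then leaf queue, then ActiveUsersManager.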

Sorry to be nitpicky about this pretty small thing; I have ended up confused by this code
multiple times and think it's worth getting right.

> CapacityScheduler loops over all running applications rather than actively requesting apps
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-2176
>                 URL: https://issues.apache.org/jira/browse/YARN-2176
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler
>    Affects Versions: 2.4.0
>            Reporter: Jason Lowe
>
> Capacity scheduler performance is dominated by LeafQueue.assignContainers, which currently
> loops over all applications that are running in the queue.  It would be more efficient to
> loop over just the applications that are actively asking for resources, as there could be
> thousands of applications running but only a few hundred that are currently asking for
> resources.
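
The idea in the description can be sketched as follows. This is an illustration only, not the LeafQueue implementation: the App and Queue classes, the one-container-per-visit policy, and the appsVisited counter are hypothetical and exist just to show that the loop cost scales with the number of requesting applications rather than the number of running ones.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ActiveAppsLoopSketch {

    /** Hypothetical application record with a count of outstanding requests. */
    static class App {
        final String id;
        int pendingContainers;

        App(String id, int pendingContainers) {
            this.id = id;
            this.pendingContainers = pendingContainers;
        }
    }

    /** Hypothetical queue that tracks requesting apps separately from running apps. */
    static class Queue {
        private final Map<String, App> runningApps = new LinkedHashMap<>();
        private final Set<App> activeApps = new LinkedHashSet<>();
        int appsVisited; // instrumentation for the example only

        void addApp(App app) {
            runningApps.put(app.id, app);
            if (app.pendingContainers > 0) {
                activeApps.add(app);
            }
        }

        int numRunningApps() {
            return runningApps.size();
        }

        int numActiveApps() {
            return activeApps.size();
        }

        /** Gives one container to each requesting app until capacity runs out. */
        List<String> assignContainers(int available) {
            List<String> assigned = new ArrayList<>();
            Iterator<App> it = activeApps.iterator();
            while (it.hasNext() && available > 0) {
                App app = it.next();
                appsVisited++;
                app.pendingContainers--;
                available--;
                assigned.add(app.id);
                if (app.pendingContainers == 0) {
                    it.remove(); // idle apps drop out of future scheduling passes
                }
            }
            return assigned;
        }
    }

    public static void main(String[] args) {
        Queue queue = new Queue();
        for (int i = 0; i < 1000; i++) {
            // Only every tenth application is asking for resources.
            queue.addApp(new App("app_" + i, i % 10 == 0 ? 1 : 0));
        }
        queue.assignContainers(1000);
        System.out.println("running=" + queue.numRunningApps()
                + " visited=" + queue.appsVisited);
    }
}
```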



--
This message was sent by Atlassian JIRA
(v6.2#6252)
