Date: Wed, 18 Jun 2014 18:46:25 +0000 (UTC)
From: "Sandy Ryza (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-2176) CapacityScheduler loops over all running applications rather than actively requesting apps

    [ https://issues.apache.org/jira/browse/YARN-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14036130#comment-14036130 ]

Sandy Ryza commented on YARN-2176:
----------------------------------

Without the ActivationCallback, the ActiveUsersManager would need to call in to the leaf queue, which it currently doesn't even have a reference to. It seems weirder to me to have an edge from the ActiveUsersManager to the leaf queue than to have an edge from the AppSchedulingInfo to the leaf queue - tracing what's going on would require more hops.

What do you think about either
* Have both the ActiveUsersManager and the leaf queue register for the callback
* Have only the leaf queue register for the callback, and then be in charge of notifying the ActiveUsersManager (which it already has a reference to)

Sorry to be nitpicky on this pretty small thing - have just ended up confused by this code multiple times and think it's worth getting right.

> CapacityScheduler loops over all running applications rather than actively requesting apps
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-2176
>                 URL: https://issues.apache.org/jira/browse/YARN-2176
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacityscheduler
>    Affects Versions: 2.4.0
>            Reporter: Jason Lowe
>
> The capacity scheduler performance is primarily dominated by LeafQueue.assignContainers, and that currently loops over all applications that are running in the queue. It would be more efficient if we looped over just the applications that are actively asking for resources rather than all applications, as there could be thousands of applications running but only a few hundred that are currently asking for resources.
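
For illustration, here is a rough sketch of the second option in the comment (only the leaf queue registers for the callback and then notifies the ActiveUsersManager), combined with the issue's goal of looping over only the apps that are actively requesting. Everything here is an assumption made for the example: the classes are simplified stand-ins prefixed with "Sketch", and the method names and signatures are hypothetical, not the actual Hadoop YARN API or the patch under review.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical callback: the name comes from the discussion, the methods are assumed.
interface ActivationCallback {
    void activated(String appId, String user);
    void deactivated(String appId, String user);
}

// Simplified stand-in for ActiveUsersManager: counts active apps per user.
class SketchActiveUsersManager {
    private final Map<String, Integer> activeAppsPerUser = new HashMap<>();

    void appActivated(String user) {
        activeAppsPerUser.merge(user, 1, Integer::sum);
    }

    void appDeactivated(String user) {
        // Returning null from the remapping function removes the user's entry.
        activeAppsPerUser.computeIfPresent(user, (u, n) -> n > 1 ? n - 1 : null);
    }

    int numActiveUsers() {
        return activeAppsPerUser.size();
    }
}

// Simplified stand-in for AppSchedulingInfo: its only edge is to the callback.
class SketchAppSchedulingInfo {
    private final String appId;
    private final String user;
    private final ActivationCallback callback;
    private int pendingRequests = 0;

    SketchAppSchedulingInfo(String appId, String user, ActivationCallback callback) {
        this.appId = appId;
        this.user = user;
        this.callback = callback;
    }

    // Fires the callback only when the app crosses between idle and requesting.
    void setPendingRequests(int newPending) {
        boolean wasActive = pendingRequests > 0;
        pendingRequests = newPending;
        boolean isActive = pendingRequests > 0;
        if (!wasActive && isActive) {
            callback.activated(appId, user);
        } else if (wasActive && !isActive) {
            callback.deactivated(appId, user);
        }
    }
}

// Simplified stand-in for the leaf queue: the single registrant for the callback.
class SketchLeafQueue implements ActivationCallback {
    private final SketchActiveUsersManager activeUsersManager = new SketchActiveUsersManager();
    private final Set<String> activeApps = new LinkedHashSet<>();

    @Override
    public void activated(String appId, String user) {
        if (activeApps.add(appId)) {
            activeUsersManager.appActivated(user); // the queue forwards the notification
        }
    }

    @Override
    public void deactivated(String appId, String user) {
        if (activeApps.remove(appId)) {
            activeUsersManager.appDeactivated(user);
        }
    }

    // What an assignContainers-style loop would iterate: active apps only.
    List<String> appsToVisit() {
        return new ArrayList<>(activeApps);
    }

    int numActiveUsers() {
        return activeUsersManager.numActiveUsers();
    }
}

class ActivationSketchDemo {
    public static void main(String[] args) {
        SketchLeafQueue queue = new SketchLeafQueue();
        SketchAppSchedulingInfo app1 = new SketchAppSchedulingInfo("app_1", "alice", queue);
        SketchAppSchedulingInfo app2 = new SketchAppSchedulingInfo("app_2", "bob", queue);
        app1.setPendingRequests(3); // app_1 becomes active
        app2.setPendingRequests(0); // app_2 is running but not requesting
        System.out.println(queue.appsToVisit());
        System.out.println(queue.numActiveUsers());
    }
}
```

In this arrangement the only edges are AppSchedulingInfo -> leaf queue -> ActiveUsersManager, so the ActiveUsersManager never needs a reference back to the leaf queue. The first option (both register) would instead need the callback holder to keep a list of listeners.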