Date: Wed, 1 Jun 2016 04:24:12 +0000 (UTC)
From: "ChenFolin (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Created] (YARN-5188) FairScheduler performance bug

ChenFolin created YARN-5188:
-------------------------------

             Summary: FairScheduler performance bug
                 Key: YARN-5188
                 URL: https://issues.apache.org/jira/browse/YARN-5188
             Project: Hadoop YARN
          Issue Type: Bug
          Components: fairscheduler
    Affects Versions: 2.5.0, 2.8.0
            Reporter: ChenFolin

My Hadoop cluster has recently encountered a performance problem. Details follow.

The efficiency of container assignment in the ResourceManager may drop as the number of running and pending applications grows. When this happens, the cluster shows a large PendingMB or PendingVcore backlog while its current utilization may be below 20%. I checked the ResourceManager logs and found that each container assignment can take 5 ~ 10 ms, compared with just 0 ~ 1 ms normally.

I used TestFairScheduler to reproduce the scenario with just one queue, root.default, and 10240 apps:

  assign container avg time: 6753.9 us (6.7539 ms)
  apps sort time (FSLeafQueue: Collections.sort(runnableApps, comparator);): 4657.01 us (4.657 ms)
  compute LeafQueue resource usage: 905.171 us (0.905171 ms)

With only root.default, one assign container op consists of:

  (one apps sort op) + 2 * (compute LeafQueue usage op)

Given these measurements, I think the assign container op has a performance problem.
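
The breakdown above can be checked with a little arithmetic. The sketch below plugs the reported timings into the stated formula (one sort plus two leaf-queue usage computations), which gives 4657.01 + 2 * 905.171 = 6467.352 us, roughly 96% of the 6753.9 us average, with the sort alone at about 69%. The class and method names (AssignCostModel, accountedMicros, share) are hypothetical helpers for illustration and are not part of Hadoop; only the three timings come from the report.

```java
// Sketch only: AssignCostModel and its methods are hypothetical helpers,
// not Hadoop APIs. The timings are the ones measured in the report.
final class AssignCostModel {

    /** Time covered by one apps sort plus two leaf-queue usage computations, in microseconds. */
    static double accountedMicros(double sortMicros, double usageMicros) {
        return sortMicros + 2.0 * usageMicros;
    }

    /** Fraction of the total time that a given part represents. */
    static double share(double partMicros, double totalMicros) {
        return partMicros / totalMicros;
    }

    public static void main(String[] args) {
        double total = 6753.9;   // assign container avg time (us)
        double sort = 4657.01;   // Collections.sort(runnableApps, comparator) (us)
        double usage = 905.171;  // compute LeafQueue resource usage (us)

        double accounted = accountedMicros(sort, usage);
        System.out.printf("sort + 2 * usage: %.3f us (%.1f%% of total)%n",
                accounted, 100.0 * share(accounted, total));
        System.out.printf("sort alone: %.1f%% of total%n",
                100.0 * share(sort, total));
    }
}
```

If these figures are representative, the per-assignment sort of runnableApps is the dominant cost, so that is probably the first place to look when investigating the slowdown.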