Date: Tue, 4 Jun 2013 21:57:21 +0000 (UTC)
From: "Thomas Graves (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-276) Capacity Scheduler can hang when submit many jobs concurrently

[ https://issues.apache.org/jira/browse/YARN-276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13675315#comment-13675315 ]

Thomas Graves commented on YARN-276:
------------------------------------

Thanks Nemon. I'm still reviewing it, but here are a couple of things so far. I hope to finish reviewing later tonight.

- LeafQueue - please wrap lines at 80 characters
- LeafQueue - please use the @VisibleForTesting annotation on setMaxAMResourcePerQueuePerUserPercent
- FiCaSchedulerApp - "for" is misspelled as "foe"
- FiCaSchedulerApp - please use the @VisibleForTesting annotation on setAMResource

I ran a few tests and looked at the scheduler web UI for the queue I was running in. The used resources and AM used resources fields showed up blank even though there were jobs running. Can you please take a look to see why? The REST web services calls were returning values for those fields.

> Capacity Scheduler can hang when submit many jobs concurrently
> --------------------------------------------------------------
>
>                 Key: YARN-276
>                 URL: https://issues.apache.org/jira/browse/YARN-276
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 3.0.0, 2.0.1-alpha
>            Reporter: nemon lou
>            Assignee: nemon lou
>              Labels: incompatible
>         Attachments: YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch, YARN-276.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In Hadoop 2.0.1, when I submit many jobs concurrently, the Capacity Scheduler can hang with most resources taken up by AMs, leaving not enough resources for tasks. All applications then hang.
> The cause is that "yarn.scheduler.capacity.maximum-am-resource-percent" is not checked directly. Instead, this property is only used to compute maxActiveApplications, and maxActiveApplications is computed from minimumAllocation (not from the resources the AMs actually use).
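
The cause described in the issue can be illustrated with a small arithmetic sketch. This is not the CapacityScheduler code: the class name, both helper methods, and all the numbers (a 100 GB cluster, a 1 GB minimum allocation, a 50% AM limit, 2 GB AMs) are assumptions chosen only to show how a limit derived from the minimum allocation can admit more AM memory than the configured percentage allows, and how a direct check on AM memory would differ.

```java
final class AmLimitSketch {

    // Hypothetical helper: an application-count limit derived from the
    // minimum allocation, as the issue describes. It implicitly assumes
    // every AM is exactly one minimum allocation in size.
    static long maxActiveApplications(long clusterMb, long minAllocMb, int amPercent) {
        long amBudgetMb = clusterMb * amPercent / 100;
        // Ceiling division, with a floor of one active application.
        return Math.max(1, (amBudgetMb + minAllocMb - 1) / minAllocMb);
    }

    // Hypothetical direct check: admit a new AM only if the memory used by
    // all AMs, including the new one, stays within the configured budget.
    static boolean canActivate(long clusterMb, int amPercent, long amUsedMb, long amRequestMb) {
        long amBudgetMb = clusterMb * amPercent / 100;
        return amUsedMb + amRequestMb <= amBudgetMb;
    }

    public static void main(String[] args) {
        long clusterMb = 102_400;  // illustrative cluster size
        long minAllocMb = 1_024;   // illustrative minimum allocation
        int amPercent = 50;        // illustrative AM resource limit, in percent
        long amSizeMb = 2_048;     // each AM is twice the minimum allocation

        long maxApps = maxActiveApplications(clusterMb, minAllocMb, amPercent);
        long amUsedMb = maxApps * amSizeMb;

        System.out.println("maxActiveApplications = " + maxApps);
        System.out.println("memory taken by AMs   = " + amUsedMb + " MB of " + clusterMb + " MB");
        System.out.println("direct check admits another AM at the budget: "
                + canActivate(clusterMb, amPercent, clusterMb * amPercent / 100, amSizeMb));
    }
}
```

With these assumed numbers the count-based limit admits 50 applications, whose AMs together occupy the whole cluster, while the direct check stops admitting AMs once their combined memory reaches half of it.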