Date: Tue, 3 Nov 2015 21:01:27 +0000 (UTC)
From: "Wangda Tan (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-3136) getTransferredContainers can be a bottleneck during AM registration


    [ https://issues.apache.org/jira/browse/YARN-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14988110#comment-14988110 ]

Wangda Tan commented on YARN-3136:
----------------------------------

Backported and committed to branch-2.7. Ran all resourcemanager tests before pushing.

Attaching the patch that was committed to branch-2.7.

> getTransferredContainers can be a bottleneck during AM registration
> -------------------------------------------------------------------
>
>                 Key: YARN-3136
>                 URL: https://issues.apache.org/jira/browse/YARN-3136
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: scheduler
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Sunil G
>              Labels: 2.7.2-candidate
>             Fix For: 2.8.0, 2.7.2
>
>         Attachments: 0001-YARN-3136.patch, 00010-YARN-3136.patch, 00011-YARN-3136.patch, 00012-YARN-3136.patch, 00013-YARN-3136.patch, 0002-YARN-3136.patch, 0003-YARN-3136.patch, 0004-YARN-3136.patch, 0005-YARN-3136.patch, 0006-YARN-3136.patch, 0007-YARN-3136.patch, 0008-YARN-3136.patch, 0009-YARN-3136.patch, YARN-3136.branch-2.7.patch
>
>
> While examining RM stack traces on a busy cluster I noticed a pattern of AMs stuck waiting for the scheduler lock trying to call getTransferredContainers. The scheduler lock is highly contended, especially on a large cluster with many nodes heartbeating, and it would be nice if we could find a way to eliminate the need to grab this lock during this call. We've already done similar work during AM allocate calls to make sure they don't needlessly grab the scheduler lock, and it would be good to do so here as well, if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)