Date: Mon, 9 Feb 2015 16:17:34 +0000 (UTC)
From: "Sunil G (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Updated] (YARN-3136) getTransferredContainers can be a bottleneck during AM registration

     [ https://issues.apache.org/jira/browse/YARN-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sunil G updated YARN-3136:
--------------------------
    Attachment: 0002-YARN-3136.patch

Hi [~jlowe] [~jianhe]

The *applications* map is now a ConcurrentMap, so it can handle concurrent access. However, as mentioned in previous comments, this can cause issues for existing custom schedulers that don't use a ConcurrentMap. Please share your comments.

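To make the idea concrete, here is a minimal, self-contained sketch of a read path that looks up per-application state in a ConcurrentMap without taking the scheduler lock. It is not the code from the attached patch: the class name, the `App` holder, the string IDs, and the method bodies are all hypothetical stand-ins, and only `java.util.concurrent` classes from the JDK are used.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch, not the real scheduler class or the YARN-3136 patch.
public class LockFreeLookupSketch {

    // Hypothetical stand-in for per-application scheduler state.
    // The container list is copied and made unmodifiable so readers
    // can use it safely without holding any lock.
    static final class App {
        final List<String> liveContainers;

        App(List<String> liveContainers) {
            this.liveContainers =
                Collections.unmodifiableList(new ArrayList<>(liveContainers));
        }
    }

    // A ConcurrentMap lets readers call get() safely while writers
    // add or remove entries.
    private final ConcurrentMap<String, App> applications =
        new ConcurrentHashMap<>();

    // Write path: still synchronized here, standing in for code that
    // runs under the (assumed) scheduler lock.
    public synchronized void addApplication(String appId, List<String> containers) {
        applications.put(appId, new App(containers));
    }

    // Read path: deliberately NOT synchronized, so a caller such as an
    // AM registration does not queue behind the scheduler lock.
    public List<String> getTransferredContainers(String appId) {
        App app = applications.get(appId);
        if (app == null) {
            return Collections.<String>emptyList();
        }
        return app.liveContainers;
    }

    public static void main(String[] args) {
        LockFreeLookupSketch scheduler = new LockFreeLookupSketch();
        scheduler.addApplication("app_1", Arrays.asList("container_1", "container_2"));
        System.out.println(scheduler.getTransferredContainers("app_1"));
        System.out.println(scheduler.getTransferredContainers("app_unknown"));
    }
}
```

The compatibility concern raised above shows up in the field type: a scheduler subclass that assigns a plain `HashMap` to a field like `applications` would no longer compile, or would lose the thread-safety guarantee, if the declared type changed to `ConcurrentMap`.
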
> getTransferredContainers can be a bottleneck during AM registration
> -------------------------------------------------------------------
>
>                 Key: YARN-3136
>                 URL: https://issues.apache.org/jira/browse/YARN-3136
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: scheduler
>    Affects Versions: 2.6.0
>            Reporter: Jason Lowe
>            Assignee: Sunil G
>         Attachments: 0001-YARN-3136.patch, 0002-YARN-3136.patch
>
>
> While examining RM stack traces on a busy cluster I noticed a pattern of AMs stuck waiting for the scheduler lock trying to call getTransferredContainers. The scheduler lock is highly contended, especially on a large cluster with many nodes heartbeating, and it would be nice if we could find a way to eliminate the need to grab this lock during this call. We've already done similar work during AM allocate calls to make sure they don't needlessly grab the scheduler lock, and it would be good to do so here as well, if possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)