Date: Mon, 21 Mar 2016 01:12:33 +0000 (UTC)
From: "Rohith Sharma K S (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-4002) make ResourceTrackerService.nodeHeartbeat more concurrent

    [ https://issues.apache.org/jira/browse/YARN-4002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203611#comment-15203611 ]

Rohith Sharma K S commented on YARN-4002:
-----------------------------------------

Thanks [~leftnoteasy] for looking at the patch. I had thought about adding a read lock in these two places, but after looking at the callers of these two methods I felt it is not really required.
# Method {{setDecomissionedNMsMetrics}} is called during service init, so it runs only as part of service initialization.
# Method {{printConfiguredHosts}} is called during service init and during refreshNodes.
## Once again, for service init, I do not think we really need to acquire the read lock.
## For refreshNodes, {{printConfiguredHosts}} is called within the write lock, so it is safe to go without the read lock.

As of now, not acquiring the read lock does not cause any problem. In the future, any new method that calls these methods will need to consider acquiring the read lock.

> make ResourceTrackerService.nodeHeartbeat more concurrent
> ---------------------------------------------------------
>
>                 Key: YARN-4002
>                 URL: https://issues.apache.org/jira/browse/YARN-4002
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Hong Zhiguo
>            Assignee: Hong Zhiguo
>            Priority: Critical
>         Attachments: 0001-YARN-4002.patch, YARN-4002-lockless-read.patch, YARN-4002-rwlock.patch, YARN-4002-v0.patch
>
>
> We have multiple RPC threads to handle NodeHeartbeatRequest from NMs. By design the method ResourceTrackerService.nodeHeartbeat should be concurrent enough to scale for large clusters.
> But we have a "BIG" lock in NodesListManager.isValidNode which I think is unnecessary.
> First, the fields "includes" and "excludes" of HostsFileReader are only updated on "refresh nodes". All RPC threads handling node heartbeats are only readers. So RWLock could be used to allow concurrent access by RPC threads.
> Second, since the fields "includes" and "excludes" of HostsFileReader are always updated by "reference assignment", which is atomic in Java, the reader side lock could just be skipped.
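
For readers of the archive, the two options named in the issue description (an RWLock for readers, or skipping the reader-side lock because the lists are replaced by reference assignment) can be sketched roughly as below. This is only an illustration, not code from any of the attached patches: the class HostListsSketch, the Snapshot holder, and all method names are hypothetical, and the validity rule (an empty include list admits every host) is an assumption. The lockless variant also assumes the reference is declared volatile and that both lists are swapped together as one immutable object, so a reader never sees a new include list paired with an old exclude list.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the host-list check; not the actual YARN classes. */
class HostListsSketch {

    /** Immutable pair of lists; a refresh builds a new one instead of mutating. */
    static final class Snapshot {
        final Set<String> includes;
        final Set<String> excludes;

        Snapshot(Set<String> includes, Set<String> excludes) {
            // Defensive copies, so later changes by the caller are not visible here.
            this.includes = Collections.unmodifiableSet(new HashSet<>(includes));
            this.excludes = Collections.unmodifiableSet(new HashSet<>(excludes));
        }

        // Assumed rule: an empty include list admits all hosts; excludes always win.
        boolean isValid(String host) {
            return (includes.isEmpty() || includes.contains(host))
                    && !excludes.contains(host);
        }
    }

    private static final Snapshot EMPTY =
            new Snapshot(Collections.<String>emptySet(), Collections.<String>emptySet());

    // Option 1: read-write lock. Heartbeat threads share the read lock;
    // a refresh takes the write lock and excludes all readers while it swaps.
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private Snapshot guarded = EMPTY;

    boolean isValidNodeWithReadLock(String host) {
        lock.readLock().lock();
        try {
            return guarded.isValid(host);
        } finally {
            lock.readLock().unlock();
        }
    }

    void refreshWithWriteLock(Set<String> includes, Set<String> excludes) {
        lock.writeLock().lock();
        try {
            guarded = new Snapshot(includes, excludes);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Option 2: lockless read. The reference write is atomic, and volatile
    // (an assumption of this sketch) makes the new snapshot visible to readers.
    private volatile Snapshot current = EMPTY;

    boolean isValidNodeLockless(String host) {
        return current.isValid(host); // single volatile read, no lock taken
    }

    void refreshLockless(Set<String> includes, Set<String> excludes) {
        current = new Snapshot(includes, excludes);
    }
}
```

In both variants the reader path does no blocking against other readers; the difference is whether a refresh briefly blocks heartbeat threads (option 1) or never does (option 2).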