Date: Fri, 12 Dec 2014 20:09:13 +0000 (UTC)
From: "Ming Ma (JIRA)"
To: hdfs-dev@hadoop.apache.org
Subject: [jira] [Created] (HDFS-7518) Heartbeat processing doesn't have to take FSN readLock

Ming Ma created HDFS-7518:
-----------------------------

             Summary: Heartbeat processing doesn't have to take FSN readLock
                 Key: HDFS-7518
                 URL: https://issues.apache.org/jira/browse/HDFS-7518
             Project: Hadoop HDFS
          Issue Type: Sub-task
            Reporter: Ming Ma


The NameNode takes the global FSN read lock when it processes heartbeat RPCs from DataNodes. This increases lock contention and could reduce overall NN throughput. Given that heartbeat processing needs to access data specific to the DataNode that invokes the RPC, it could instead synchronize on just that DataNode and datanodeMap.
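To make the proposal concrete, here is a minimal sketch of heartbeat handling that synchronizes on datanodeMap for the lookup and on the individual node for the update, without any namesystem-wide lock. HeartbeatSketch, NodeState, and all method names are hypothetical stand-ins for illustration, not the actual HDFS classes or the eventual patch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the NameNode's heartbeat path.
// None of these names come from HDFS itself.
class HeartbeatSketch {

    // Hypothetical per-DataNode state that a heartbeat updates.
    static final class NodeState {
        private long lastUpdateMs;
        private long remainingBytes;

        // Guarded by this node's own monitor, not by a global lock.
        synchronized void update(long nowMs, long remaining) {
            lastUpdateMs = nowMs;
            remainingBytes = remaining;
        }

        synchronized long lastUpdateMs() {
            return lastUpdateMs;
        }

        synchronized long remainingBytes() {
            return remainingBytes;
        }
    }

    // Map from DataNode identifier to its state; guarded by its own monitor.
    private final Map<String, NodeState> datanodeMap = new HashMap<>();

    void register(String uuid) {
        synchronized (datanodeMap) {
            datanodeMap.putIfAbsent(uuid, new NodeState());
        }
    }

    // Handles one heartbeat. Returns false if the node is not registered.
    boolean handleHeartbeat(String uuid, long nowMs, long remaining) {
        NodeState node;
        // Short critical section: only the map lookup is serialized.
        synchronized (datanodeMap) {
            node = datanodeMap.get(uuid);
        }
        if (node == null) {
            return false;
        }
        // Heartbeats from different DataNodes contend only on their own node.
        node.update(nowMs, remaining);
        return true;
    }

    // Returns the last reported remaining bytes, or -1 if the node is unknown.
    long remainingBytes(String uuid) {
        NodeState node;
        synchronized (datanodeMap) {
            node = datanodeMap.get(uuid);
        }
        return node == null ? -1L : node.remainingBytes();
    }
}
```

This sketch leaves out the hard part: showing that nothing else read or written during heartbeat processing (for example, node removal or re-registration racing with an update) still depends on the FSN lock.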
It looks like each DatanodeDescriptor already keeps its own recover blocks, replication blocks, and invalidate blocks. Several places need to be changed to remove the FSN lock. As mentioned in other JIRAs, we need some mechanism to reason about the correctness of the solution. Thoughts?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)