Date: Wed, 7 Mar 2012 13:19:03 +0000 (UTC)
From: "Hudson (Commented) (JIRA)"
To: hdfs-issues@hadoop.apache.org
Reply-To: hdfs-issues@hadoop.apache.org
Message-ID: <1154470969.33689.1331126343753.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1574483887.14084.1319083810722.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-2477) Optimize computing the diff between a block report and the namenode state.

    [ https://issues.apache.org/jira/browse/HDFS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13224297#comment-13224297 ]

Hudson commented on HDFS-2477:
------------------------------

Integrated in Hadoop-Mapreduce-0.23-Build #218 (See [https://builds.apache.org/job/Hadoop-Mapreduce-0.23-Build/218/])
    HDFS-2477. Merging change r1197801 from trunk to 0.23 (Revision 1297866)
    HDFS-2477. Merging change r1196676 from trunk to 0.23 (Revision 1297861)

    Result = FAILURE

suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1297866
Files :
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockInfo.java

suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1297861
Files :
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockInfo.java
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java


> Optimize computing the diff between a block report and the namenode state.
> --------------------------------------------------------------------------
>
>                 Key: HDFS-2477
>                 URL: https://issues.apache.org/jira/browse/HDFS-2477
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Tomasz Nykiel
>            Assignee: Tomasz Nykiel
>             Fix For: 0.24.0, 0.23.3
>
>         Attachments: reportDiff.patch, reportDiff.patch-2, reportDiff.patch-3, reportDiff.patch-4, reportDiff.patch-5
>
>
> When a block report is processed at the NN, BlockManager.reportDiff traverses all blocks contained in the report. Each block that is also present in the corresponding datanode descriptor is moved to the head of the list of blocks in that datanode descriptor.
> With HDFS-395, the vast majority of the blocks in the report are also present in the datanode descriptor, which means that almost every block in the report has to be moved to the head of the list.
> Currently this operation is performed by DatanodeDescriptor.moveBlockToHead, which removes a block from the list and then inserts it again. In this process, findDatanode is called several times (as far as I recall, 6 times for each moveBlockToHead call). findDatanode is relatively expensive, since it scans the triplets linearly to locate the given datanode.
> With this patch, we memoize some of the findDatanode results, which saves 2 findDatanode calls. Our experiments show that this can improve the performance of reportDiff (which is executed under the write lock) by around 15%. Currently, with HDFS-395, reportDiff is responsible for almost 100% of the block report processing time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
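
For readers unfamiliar with the code, the memoization described in the issue can be sketched roughly as follows. This is a simplified model and not the actual patch: the class MoveToHeadSketch, its Block type, the scans counter and the methods moveToHeadNaive and moveToHeadMemo are hypothetical names made up for illustration, and the triplet layout (datanode, previous block, next block) is an assumption. The scan counts in this model do not match the 6 calls mentioned in the description. Only the idea comes from the issue: findDatanode is a linear scan, so indexes the caller already knows are passed in instead of being looked up again.

```java
final class MoveToHeadSketch {
    /** Counts linear scans so the two variants can be compared. */
    static int scans = 0;

    /** Hypothetical stand-in for a block with one triplet per replica. */
    static final class Block {
        final String name;
        // Assumed layout: [3*i] = datanode, [3*i+1] = previous block,
        // [3*i+2] = next block in that datanode's list.
        final Object[] triplets;

        Block(String name, Object... datanodes) {
            this.name = name;
            this.triplets = new Object[3 * datanodes.length];
            for (int i = 0; i < datanodes.length; i++) triplets[3 * i] = datanodes[i];
        }

        /** Linear scan over the triplets; this is the expensive step. */
        int findDatanode(Object dn) {
            scans++;
            for (int i = 0; i < triplets.length / 3; i++) {
                if (triplets[3 * i] == dn) return i;
            }
            return -1;
        }

        Block prev(int i) { return (Block) triplets[3 * i + 1]; }
        Block next(int i) { return (Block) triplets[3 * i + 2]; }
        void setPrev(int i, Block b) { triplets[3 * i + 1] = b; }
        void setNext(int i, Block b) { triplets[3 * i + 2] = b; }
    }

    /** Remove-then-insert, scanning for the datanode at every step. */
    static Block moveToHeadNaive(Block head, Block b, Object dn) {
        if (b == head) return head;
        int i = b.findDatanode(dn);                 // scan for the removal
        Block prev = b.prev(i), next = b.next(i);
        if (prev != null) prev.setNext(prev.findDatanode(dn), next);
        if (next != null) next.setPrev(next.findDatanode(dn), prev);
        int j = b.findDatanode(dn);                 // same answer, scanned again
        b.setPrev(j, null);
        b.setNext(j, head);
        head.setPrev(head.findDatanode(dn), b);     // scan on the old head
        return b;
    }

    /** Same relinking, but the caller supplies the indexes it already knows. */
    static Block moveToHeadMemo(Block head, Block b, Object dn,
                                int curIndex, int headIndex) {
        if (b == head) return head;
        Block prev = b.prev(curIndex), next = b.next(curIndex);
        if (prev != null) prev.setNext(prev.findDatanode(dn), next);
        if (next != null) next.setPrev(next.findDatanode(dn), prev);
        b.setPrev(curIndex, null);
        b.setNext(curIndex, head);
        head.setPrev(headIndex, b);                 // no scan needed
        return b;
    }

    /** Links the given blocks, in order, into dn's list and returns the head. */
    static Block link(Object dn, Block... blocks) {
        for (int k = 0; k < blocks.length; k++) {
            int i = blocks[k].findDatanode(dn);
            blocks[k].setPrev(i, k > 0 ? blocks[k - 1] : null);
            blocks[k].setNext(i, k + 1 < blocks.length ? blocks[k + 1] : null);
        }
        return blocks[0];
    }

    /** Concatenates block names in list order, for inspection. */
    static String order(Block head, Object dn) {
        StringBuilder sb = new StringBuilder();
        for (Block b = head; b != null; b = b.next(b.findDatanode(dn))) sb.append(b.name);
        return sb.toString();
    }

    public static void main(String[] args) {
        Object dn = new Object(), other = new Object();
        Block a = new Block("a", other, dn);
        Block b = new Block("b", dn, other);
        Block c = new Block("c", other, dn);
        Block head = link(dn, a, b, c);

        scans = 0;
        head = moveToHeadNaive(head, c, dn);
        System.out.println("naive scans: " + scans + ", order: " + order(head, dn));

        int cur = b.findDatanode(dn), headIdx = head.findDatanode(dn);
        scans = 0;
        head = moveToHeadMemo(head, b, dn, cur, headIdx);
        System.out.println("memo scans: " + scans + ", order: " + order(head, dn));
    }
}
```

In this sketch the caller is assumed to know both indexes already: the current block's index from checking that the block belongs to the datanode, and the head's index from the previous move. Each index passed in this way replaces one linear scan per move.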