Date: Thu, 21 Apr 2011 08:31:05 +0000 (UTC)
From: "dhruba borthakur (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <347577139.73103.1303374665798.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-1295) Improve namenode restart times by short-circuiting the first block reports from datanodes

[ 
https://issues.apache.org/jira/browse/HDFS-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13022685#comment-13022685 ]

dhruba borthakur commented on HDFS-1295:
----------------------------------------

hi suresh, the block-report creation time is included in the metric.

> Improve namenode restart times by short-circuiting the first block reports from datanodes
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-1295
>                 URL: https://issues.apache.org/jira/browse/HDFS-1295
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: dhruba borthakur
>            Assignee: Matt Foley
>             Fix For: 0.23.0
>
>         Attachments: IBR_shortcut_v2a.patch, IBR_shortcut_v3atrunk.patch, IBR_shortcut_v4atrunk.patch, IBR_shortcut_v4atrunk.patch, IBR_shortcut_v4atrunk.patch, IBR_shortcut_v6atrunk.patch, IBR_shortcut_v7atrunk.patch, shortCircuitBlockReport_1.txt
>
>
> The namenode restart is dominated by the performance of processing block reports. On a 2000 node cluster with 90 million blocks, block report processing takes 30 to 40 minutes. The namenode "diffs" the contents of the incoming block report with the contents of the blocks map, and then applies these diffs to the blocksMap, but in reality there is no need to compute the "diff" because this is the first block report from the datanode.
> This code change improves block report processing time by 300%.
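
The difference between the diff-based path and the first-report shortcut described in the quoted issue can be illustrated with a small sketch. This is not the code from the attached patches: the class `BlockReportSketch`, its methods, and the per-datanode map are all hypothetical, and the real namenode's blocksMap is considerably more involved. The sketch only assumes that a datanode with no recorded blocks is sending its first report.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for the namenode's block bookkeeping.
// None of these names come from the HDFS code base or the attached patches.
class BlockReportSketch {
    // datanode id -> block ids currently attributed to that datanode
    private final Map<String, Set<Long>> blocksByNode = new HashMap<>();

    // General path: diff the report against the stored state, then apply the diff.
    void processReportWithDiff(String node, List<Long> reported) {
        Set<Long> stored = blocksByNode.computeIfAbsent(node, k -> new HashSet<>());
        Set<Long> reportedSet = new HashSet<>(reported);

        Set<Long> toAdd = new HashSet<>(reportedSet);
        toAdd.removeAll(stored);           // reported but not yet recorded
        Set<Long> toRemove = new HashSet<>(stored);
        toRemove.removeAll(reportedSet);   // recorded but no longer reported

        stored.addAll(toAdd);
        stored.removeAll(toRemove);
    }

    // Shortcut path: nothing is recorded for this node yet, so every reported
    // block is an addition and no diff needs to be computed.
    void processFirstReport(String node, List<Long> reported) {
        blocksByNode.put(node, new HashSet<>(reported));
    }

    // Assumption: "no blocks recorded for the node" means "first report".
    void processReport(String node, List<Long> reported) {
        Set<Long> stored = blocksByNode.get(node);
        if (stored == null || stored.isEmpty()) {
            processFirstReport(node, reported);
        } else {
            processReportWithDiff(node, reported);
        }
    }

    // Returns a copy so callers cannot mutate the internal state.
    Set<Long> blocksOf(String node) {
        return new HashSet<>(blocksByNode.getOrDefault(node, Collections.emptySet()));
    }

    public static void main(String[] args) {
        BlockReportSketch nn = new BlockReportSketch();
        nn.processReport("dn1", Arrays.asList(1L, 2L, 3L)); // first report: shortcut
        nn.processReport("dn1", Arrays.asList(2L, 3L, 4L)); // later report: diff
        System.out.println(nn.blocksOf("dn1"));
    }
}
```

Both paths leave the map in the same state for a first report; the shortcut simply skips building the two intermediate sets, which is the work the issue argues is unnecessary right after a namenode restart.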