Date: Tue, 5 Jul 2011 23:00:16 +0000 (UTC)
From: "Matt Foley (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <156629838.2163.1309906816906.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (HDFS-1366) reduce namenode startup time by optimising checkBlockInfo while loading fsimage

[ https://issues.apache.org/jira/browse/HDFS-1366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt Foley updated
HDFS-1366:
-----------------------------
    Attachment: FSImageRead_shortcut_proto.patch

The code has changed since this ticket was opened. When I ran some experiments in March, there was no longer a BlocksMap.checkBlockInfo() method, and the call sequence was:
{code}
FSImage.loadFSImage()
FSImageFormat.Loader.load()
FSImageFormat.Loader.loadFullNameINodes()
FSDirectory.addToParent()
BlockManager.addINode()
BlocksMap.addINode()
{code}

BlocksMap.addINode() did this:
{code}
BlockInfo addINode(BlockInfo b, INodeFile iNode) {
  BlockInfo info = blocks.get(b);
  if (info != b) {
    info = b;
    blocks.put(info);
  }
  info.setINode(iNode);
  return info;
}
{code}

which could be replaced by
{code}
BlockInfo addINode(BlockInfo b, INodeFile iNode) {
  blocks.put(b);
  b.setINode(iNode);
  return b;
}
{code}

Calling blocks.get() before conditionally calling blocks.put() in this way is wasteful, whether we are reading the FSImage or calling addINode() for any other purpose: put and get cost about the same, and the result of calling put alone is identical to that of the code above.

However, I put this into a simple proof-of-principle patch (attached - not ready for prime time) and tried it. I only got a 6% improvement in FSImage load time.

> reduce namenode startup time by optimising checkBlockInfo while loading fsimage
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-1366
>                 URL: https://issues.apache.org/jira/browse/HDFS-1366
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>         Attachments: FSImageRead_shortcut_proto.patch
>
>
> The namenode spends about 10 minutes reading a 14 GB fsimage file into memory and creating all the in-memory data structures. A jstack-based debugger clearly shows that most of the time during the fsimage load is spent in BlocksMap.checkBlockInfo. There is an easy way to optimize this method, especially for this code path.
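
The equivalence claimed in the comment above (get followed by a conditional put leaves the same state as an unconditional put) can be checked with a small stand-alone sketch. Everything here is a hypothetical stand-in, not the real HDFS code: the Blocks class wraps a java.util.HashMap keyed by block id and is only assumed to behave like the actual block set in that put() replaces an existing equal entry, and a String takes the place of INodeFile.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are not the HDFS classes.
class AddINodeSketch {

    static final class BlockInfo {
        final long id;
        String iNode; // stands in for the INodeFile reference
        BlockInfo(long id) { this.id = id; }
    }

    // Assumed behaviour of the block set: put() replaces any entry with the same id.
    static final class Blocks {
        private final Map<Long, BlockInfo> map = new HashMap<>();
        BlockInfo get(BlockInfo b) { return map.get(b.id); }
        void put(BlockInfo b) { map.put(b.id, b); }
    }

    // Shape of the original method: look up first, store only if the object differs.
    static BlockInfo addINodeOriginal(Blocks blocks, BlockInfo b, String iNode) {
        BlockInfo info = blocks.get(b);
        if (info != b) {
            info = b;
            blocks.put(info);
        }
        info.iNode = iNode;
        return info;
    }

    // Shape of the proposed replacement: store unconditionally.
    static BlockInfo addINodeSimplified(Blocks blocks, BlockInfo b, String iNode) {
        blocks.put(b);
        b.iNode = iNode;
        return b;
    }

    public static void main(String[] args) {
        Blocks blocks = new Blocks();
        BlockInfo first = new BlockInfo(42L);
        BlockInfo replacement = new BlockInfo(42L); // same id, different object
        addINodeSimplified(blocks, first, "fileA");
        addINodeSimplified(blocks, replacement, "fileB");
        // The set now holds the most recently added object for id 42.
        System.out.println(blocks.get(replacement) == replacement);
    }
}
```

In all three cases (block absent, same object already stored, different object with the same id stored) both variants return b and leave b as the stored entry; the simplified one just skips the lookup. Whether that holds for the real block set depends on its put() semantics, which this sketch only assumes.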
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira