Message-ID: <28491065.251561278552352175.JavaMail.jira@thor>
Date: Wed, 7 Jul 2010 21:25:52 -0400 (EDT)
From: "Todd Lipcon (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] Commented: (HDFS-1140) Speedup INode.getPathComponents
In-Reply-To: <22042367.5701273528950268.JavaMail.jira@thor>

    [ https://issues.apache.org/jira/browse/HDFS-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12886177#action_12886177 ]

Todd Lipcon commented on HDFS-1140:
-----------------------------------

Hmm, TestFileAppend4 passes for me on a trunk checkout. It seems like some test that ran prior to it didn't close resources properly? Does it fail on your machine?

> Speedup INode.getPathComponents
> -------------------------------
>
>                 Key: HDFS-1140
>                 URL: https://issues.apache.org/jira/browse/HDFS-1140
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.22.0
>            Reporter: Dmytro Molkov
>            Assignee: Dmytro Molkov
>            Priority: Minor
>             Fix For: 0.22.0
>
>         Attachments: HDFS-1140.2.patch, HDFS-1140.3.patch, HDFS-1140.4.patch, HDFS-1140.patch
>
>
> When the namenode is loading the image there is a significant amount of time being spent in the DFSUtil.string2Bytes. We have a very specific workload here. The path that namenode does getPathComponents for shares N - 1 component with the previous path this method was called for (assuming current path has N components).
> Hence we can improve the image load time by caching the result of previous conversion.
> We thought of using some simple LRU cache for components, but the reality is, String.getBytes gets optimized during runtime and LRU cache doesn't perform as well, however using just the latest path components and their translation to bytes in two arrays gives quite a performance boost.
> I could get another 20% off of the time to load the image on our cluster (30 seconds vs 24) and I wrote a simple benchmark that tests performance with and without caching.
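
For readers without the patches at hand, the "latest path components in two arrays" idea from the description might look roughly like the sketch below. This is not the code from any HDFS-1140 patch: the class name PathComponentCache, its fields, and the use of String.split are assumptions made for illustration only. It keeps the previous path's component strings in one array and their byte encodings in a second array, and re-encodes a component only when it differs from the cached one at the same position.

```java
import java.nio.charset.StandardCharsets;

/**
 * Hypothetical sketch of a "last path" cache; not the HDFS-1140 patch.
 * Not thread-safe, and returned byte arrays are shared between calls,
 * so callers must treat them as read-only.
 */
public class PathComponentCache {
    // Components of the most recently converted path...
    private String[] lastNames = new String[0];
    // ...and their UTF-8 encodings, index-aligned with lastNames.
    private byte[][] lastBytes = new byte[0][];

    /**
     * Splits the path on "/" and returns one byte[] per component.
     * For "/user/foo/a" the first component is the empty string.
     */
    public byte[][] getPathComponents(String path) {
        String[] names = path.split("/");
        byte[][] bytes = new byte[names.length][];
        for (int i = 0; i < names.length; i++) {
            if (i < lastNames.length && names[i].equals(lastNames[i])) {
                // Same component at the same depth as last time: reuse bytes.
                bytes[i] = lastBytes[i];
            } else {
                // New component: pay for the String-to-bytes conversion.
                bytes[i] = names[i].getBytes(StandardCharsets.UTF_8);
            }
        }
        // Remember only the latest path; no LRU bookkeeping.
        lastNames = names;
        lastBytes = bytes;
        return bytes;
    }
}
```

With a workload like the one described, where consecutive paths share all but the last component, only one conversion per call is needed. Whether this beats plain String.getBytes in practice depends on the JVM and the data, which is what the benchmark mentioned in the description is for.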