hadoop-hdfs-issues mailing list archives

From "Matt Foley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1070) Speedup NameNode image loading and saving by storing local file names
Date Thu, 31 Mar 2011 19:20:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014145#comment-13014145 ]

Matt Foley commented on HDFS-1070:
----------------------------------

Hi Hairong, nice clean patch, thanks for this improvement.  A couple of minor comments:

1. The original code had an implicit consistency check between numFiles and the number of
files actually read in by load().  Can you add a check after the call to loadLocalNameINodes(in)
to ensure that namesystem.dir.rootDir.numItemsInTree() == numFiles?

2. Since typically #files >> #directories, I suggest that the last loop in saveINodes()
be changed to avoid unnecessary recursion on every file inode:
{code}
// from:
        for (INode child : children) {
          saveINodes(child, out);
        }
// to:
        for (INode child : children) {
          if (child.isDirectory()) {
            saveINodes(child, out);  // recurse only into directories
          } else {
            FSImageSerialization.saveINode2Image(child, out);  // write file inodes directly
          }
        }
{code}
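
To make both comments concrete, here is a minimal, self-contained sketch. It is not the patch's code: Node and LocalNameImage are hypothetical stand-ins for INode and the FSImage code, and the record layout (local name, directory flag, child count for directories, depth-first order) is an assumption made for illustration only. It shows the save loop recursing only into directories, and the load path comparing the item count in the rebuilt tree against the count stored in the header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an inode; not the real HDFS INode class. */
class Node {
    final String localName;
    final boolean directory;
    final List<Node> children = new ArrayList<>();

    Node(String localName, boolean directory) {
        this.localName = localName;
        this.directory = directory;
    }

    Node add(Node child) {
        children.add(child);
        return this;
    }

    /** Counts this node plus all of its descendants. */
    long numItemsInTree() {
        long total = 1;
        for (Node child : children) {
            total += child.numItemsInTree();
        }
        return total;
    }
}

/** Hypothetical image format that stores local names only (assumed layout). */
class LocalNameImage {
    /** Writes one record: local name, directory flag, child count for directories. */
    static void saveNode(Node node, DataOutputStream out) throws IOException {
        out.writeUTF(node.localName);
        out.writeBoolean(node.directory);
        if (node.directory) {
            out.writeInt(node.children.size());
        }
    }

    /** Saves a directory, recursing only into child directories. */
    static void saveTree(Node dir, DataOutputStream out) throws IOException {
        saveNode(dir, out);
        for (Node child : dir.children) {
            if (child.directory) {
                saveTree(child, out);
            } else {
                saveNode(child, out); // file: write directly, no recursive call
            }
        }
    }

    /** Rebuilds a subtree; the parent is implied by position and child count. */
    static Node loadTree(DataInputStream in) throws IOException {
        String name = in.readUTF();
        boolean isDir = in.readBoolean();
        Node node = new Node(name, isDir);
        if (isDir) {
            int numChildren = in.readInt();
            for (int i = 0; i < numChildren; i++) {
                node.add(loadTree(in));
            }
        }
        return node;
    }

    /** Serializes the tree, prefixed by the total item count. */
    static byte[] save(Node root) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeLong(root.numItemsInTree());
            saveTree(root, out);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Loads the tree and checks it against the stored item count. */
    static Node load(byte[] image) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(image));
            long numFiles = in.readLong();
            Node root = loadTree(in);
            if (root.numItemsInTree() != numFiles) {
                throw new IllegalStateException("expected " + numFiles
                        + " items but loaded " + root.numItemsInTree());
            }
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the explicit count check in load() plays the role of the implicit check in the original code: a header count that disagrees with the tree that was actually rebuilt fails loudly instead of going unnoticed.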

> Speedup NameNode image loading and saving by storing local file names
> ---------------------------------------------------------------------
>
>                 Key: HDFS-1070
>                 URL: https://issues.apache.org/jira/browse/HDFS-1070
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: Hairong Kuang
>            Assignee: Hairong Kuang
>         Attachments: trunkLocalNameImage.patch, trunkLocalNameImage1.patch, trunkLocalNameImage3.patch,
>                      trunkLocalNameImage4.patch, trunkLocalNameImage5.patch
>
>
> Currently each inode stores its full path in the fsimage. I propose storing the local
> name instead. For each inode to identify its parent, all inodes in a directory tree
> are stored in the image in in-order. This proposal also requires each directory to store the
> number of its children in the image.
> This proposal would bring a few benefits, listed below, and therefore speed up image
> loading and saving.
> # Remove the overhead of converting a Java-UTF8-encoded local name to a string-represented
> full path and then to a UTF8-encoded full path when saving an image, and vice versa when loading
> the image.
> # Remove the overhead of traversing the full path when inserting an inode into its parent
> inode.
> # Reduce the number of temporary Java objects created during image loading or
> saving, and therefore reduce the GC overhead.
> # Reduce the size of an image.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
