hadoop-common-dev mailing list archives

From "girish vaitheeswaran (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3248) Improve Namenode startup performance
Date Tue, 06 May 2008 01:47:55 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12594440#action_12594440 ]

girish vaitheeswaran commented on HADOOP-3248:

>> You mentioned there were 20 million files. Could you mention how many blocks and
>> how many replicas there are per file? Also, the max heap size was 14GB. Did the namenode
>> use all of 14GB when it started?
>>Thanks for your response

I was using the default replication value of 3 for these experiments. I don't recall the memory
usage on the Namenode side. The -Xms parameter did help quite a bit in improving startup
time, and after trying different values, 14G turned out to be the most performant. As
for the number of blocks, I am not sure, but I believe it averaged slightly less
than 2 per file.
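
For a rough sense of scale, the figures above (20 million files, slightly less than 2 blocks per file on average, replication of 3) can be turned into upper bounds on the block and replica counts. This is only back-of-the-envelope arithmetic; the class and method names are made up for illustration, and 2.0 is used as a ceiling for the "slightly less than 2" average.

```java
public class NamespaceEstimate {
    // Upper bound on the number of blocks for a given file count
    // and average blocks-per-file figure.
    public static long blocks(long files, double avgBlocksPerFile) {
        return Math.round(files * avgBlocksPerFile);
    }

    // Total block replicas stored across datanodes for a uniform replication factor.
    public static long replicas(long blocks, int replication) {
        return blocks * replication;
    }

    public static void main(String[] args) {
        long files = 20_000_000L;      // from the issue description
        double avgBlocksPerFile = 2.0; // ceiling for "slightly less than 2"
        int replication = 3;           // default replication used in the experiments

        long b = blocks(files, avgBlocksPerFile);
        long r = replicas(b, replication);
        System.out.println("blocks (upper bound):   " + b);
        System.out.println("replicas (upper bound): " + r);
    }
}
```

So the namespace in question holds somewhat fewer than 40 million blocks and somewhat fewer than 120 million replicas.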

> Improve Namenode startup performance
> ------------------------------------
>                 Key: HADOOP-3248
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3248
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: dfs
>            Reporter: girish vaitheeswaran
>            Assignee: dhruba borthakur
>         Attachments: fastRestarts.patch, fastRestarts.patch, fastRestarts2.patch, fastRestarts3.patch,
fastRestarts3.patch, FSImage.patch
> One of the things that would need to be addressed as part of Namenode scalability is
the HDFS recovery performance especially in scenarios where the number of files is large.
There are instances where the number of files is in the vicinity of 20 million, and in such
cases the time taken for namenode startup is prohibitive. Here are some benchmark numbers
on the time taken for namenode startup. These times do not include the time to process block
> Default scenario for 20 million files with the max java heap size set to 14GB : 40 minutes
> Tuning various java options such as young size, parallel garbage collection, initial
java heap size : 14 minutes
> As can be seen, 14 minutes is still a long time for the namenode to recover, and code
changes are required to bring this time down further. To this end, some prototype optimizations
were done to reduce this time. Timing analysis showed that saveImage and loadFSImage were
the primary methods consuming most of the time, and most of that time was being spent
on object allocations. The goal of the optimizations is to reduce the number of memory
allocations as much as possible.
> Optimization 1: saveImage() 
> ======================
> Avoid allocation of the UTF8 object.
> Old code
> =======
> new UTF8(fullName).write(out);
> New Code
> ========
> out.writeUTF(fullName);
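
A minimal, self-contained sketch of the idea in Optimization 1, using only java.io: each name is written straight to the stream with DataOutput.writeUTF, so no per-name wrapper object is created. The class name, the leading count field, and the sample paths are assumptions for illustration and are not the actual FSImage layout. Note that writeUTF uses a 2-byte length prefix and modified UTF-8, so names are limited to 65535 encoded bytes, and whether that encoding matches what the UTF8 class writes is not established here; the reader simply has to use the matching readUTF.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DirectNameWrite {
    // Writes a count followed by each name via writeUTF.
    // No intermediate wrapper object is allocated per name.
    public static byte[] writeNames(List<String> names) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(names.size());
            for (String name : names) {
                out.writeUTF(name);
            }
        }
        return buf.toByteArray();
    }

    // Reads the names back with the matching readUTF call.
    public static List<String> readNames(byte[] data) throws IOException {
        List<String> names = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                names.add(in.readUTF());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        List<String> names = Arrays.asList("/user/a/part-00000", "/user/a/part-00001");
        byte[] image = writeNames(names);
        System.out.println("round trip ok: " + readNames(image).equals(names));
    }
}
```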
> Optimization 2: saveImage()
> ======================
> Avoid object allocation of the PermissionStatus Object and the FsPermission object. This
is to be done for Directories and for files.
> Old code
> =======
> fileINode.getPermissionStatus().write(out);
> New Code
> =========
> out.writeBytes(fileINode.getUserName());
> out.writeBytes(fileINode.getGroupName());
> out.writeShort(fileINode.getFsPermission().toShort());
> Optimization 3
> ============
> loadImage() could use the same mechanism where we would avoid allocating the PermissionStatus
object and the FsPermission object.
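
The save and load sides of Optimizations 2 and 3 can be sketched together as a round trip: the owner, group, and permission bits are written as plain fields and read back directly into the node, with no intermediate permission objects. The Node class and method names below are hypothetical stand-ins, not Hadoop classes. One deliberate difference from the snippet above: this sketch writes the strings with writeUTF rather than writeBytes, on the assumption that the reader needs a length prefix to find where each string ends (DataOutput.writeBytes writes one byte per character with no length).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class DirectPermissionIO {
    // Hypothetical stand-in for an inode's ownership fields.
    static final class Node {
        String user;
        String group;
        short mode; // permission bits, e.g. 0644
    }

    // Save side: write the three fields directly, no wrapper objects.
    public static void writeFields(DataOutput out, Node node) throws IOException {
        out.writeUTF(node.user);   // assumption: length-prefixed so it can be read back
        out.writeUTF(node.group);
        out.writeShort(node.mode);
    }

    // Load side: read straight into the existing node, in the same order.
    public static void readFields(DataInput in, Node node) throws IOException {
        node.user = in.readUTF();
        node.group = in.readUTF();
        node.mode = in.readShort();
    }

    public static void main(String[] args) throws IOException {
        Node saved = new Node();
        saved.user = "hdfs";
        saved.group = "supergroup";
        saved.mode = (short) 0644;

        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        writeFields(new DataOutputStream(buf), saved);

        Node loaded = new Node();
        readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())), loaded);
        System.out.println(loaded.user + " " + loaded.group + " "
                + Integer.toOctalString(loaded.mode));
    }
}
```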
> Optimization 4
> ============
> A hack was tried out to avoid the cost of object allocation from saveImage() where the
fullName was being constructed using string concatenation. This optimization also helped improve
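
The hack itself is not described in this message, so the following is only one common way to avoid repeated string concatenation when building full path names during a tree walk, not necessarily what the patch does. A single StringBuilder is extended on the way down and truncated on the way back up, so no temporary parent + "/" + name strings are created at each level. The Node class and all names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class PathWalk {
    // Hypothetical minimal directory tree node; not the Hadoop INode class.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Depth-first walk that appends to one shared StringBuilder per level
    // and restores its previous length before returning.
    public static void walk(Node node, StringBuilder path, List<String> out) {
        int mark = path.length();
        path.append('/').append(node.name);
        out.add(path.toString()); // the only String created for this node
        for (Node child : node.children) {
            walk(child, path, out);
        }
        path.setLength(mark); // backtrack to the parent's path
    }

    public static void main(String[] args) {
        Node user = new Node("user")
                .add(new Node("alice").add(new Node("f1")).add(new Node("f2")));
        List<String> paths = new ArrayList<>();
        walk(user, new StringBuilder(), paths);
        for (String p : paths) {
            System.out.println(p);
        }
    }
}
```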
> Overall, these optimizations brought the startup time down to slightly
over 7 minutes. Most of the remaining time is now spent in loadFSImage(), since we allocate
the INode and INodeDirectory objects. Any further optimizations will need to focus on loadFSImage().

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
