hadoop-hdfs-issues mailing list archives

From "Fengdong Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6154) Improve the speed of saveNameSpace, making HDFS restart and checkPoint faster
Date Tue, 25 Mar 2014 07:44:53 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946260#comment-13946260 ]

Fengdong Yu commented on HDFS-6154:
-----------------------------------

Hi [~guodongdong],

Please change your LANG to en_US.utf-8 before generating the patch.
I have several comments:

{code}
-      DigestOutputStream fos = new DigestOutputStream(fout, digester);
+      java.io.OutputStream fos = new DigestOutputStream(fout, digester);
+      fos = new AsyncBufferedOutputStream(fos);
{code}

This could be written as:
{code}
 java.io.OutputStream fos = new AsyncBufferedOutputStream(
       new DigestOutputStream(fout, digester));
{code}
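
To illustrate the wrapping order, here is a minimal, self-contained sketch. The AsyncBufferedOutputStream below is a hypothetical stand-in written only for this example, not the class from the attached patch, and a ByteArrayOutputStream stands in for the fsimage file. It assumes a single producer thread that hands chunks to a background thread through a bounded queue, so the MD5 update and the write happen off the serializing thread.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.security.DigestOutputStream;
import java.security.MessageDigest;
import java.util.Arrays;
import java.util.Random;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class AsyncDigestDemo {

  /** Hypothetical stand-in for the patch's AsyncBufferedOutputStream (assumption). */
  static class AsyncBufferedOutputStream extends OutputStream {
    private static final byte[] EOF = new byte[0]; // sentinel, compared by identity
    private final BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(16);
    private final OutputStream out;
    private final Thread writer;
    private volatile IOException failure;
    private boolean closed;

    AsyncBufferedOutputStream(OutputStream out) {
      this.out = out;
      this.writer = new Thread(this::drain, "async-writer");
      this.writer.setDaemon(true);
      this.writer.start();
    }

    /** Background loop: forwards queued chunks to the wrapped stream. */
    private void drain() {
      try {
        for (byte[] chunk = queue.take(); chunk != EOF; chunk = queue.take()) {
          if (failure == null) {
            try {
              out.write(chunk);
            } catch (IOException e) {
              failure = e; // keep draining so the producer never blocks forever
            }
          }
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }

    @Override
    public void write(int b) throws IOException {
      write(new byte[] {(byte) b}, 0, 1);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
      if (failure != null) throw failure;
      if (len == 0) return;
      try {
        // Copy the range so the caller may reuse its buffer immediately.
        queue.put(Arrays.copyOfRange(b, off, off + len));
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IOException("interrupted while queuing data", e);
      }
    }

    @Override
    public void close() throws IOException {
      if (closed) return;
      closed = true;
      try {
        queue.put(EOF);
        writer.join(); // wait until every queued chunk has been written
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IOException("interrupted while closing", e);
      }
      out.close();
      if (failure != null) throw failure;
    }
  }

  /** Writes data through the stream chain and returns the MD5 of what was written. */
  static byte[] md5(byte[] data, boolean async) throws Exception {
    MessageDigest digester = MessageDigest.getInstance("MD5");
    ByteArrayOutputStream fout = new ByteArrayOutputStream(); // stands in for the image file
    OutputStream fos = new DigestOutputStream(fout, digester);
    if (async) {
      fos = new AsyncBufferedOutputStream(fos);
    }
    try (OutputStream o = fos) {
      for (int off = 0; off < data.length; off += 4096) {
        o.write(data, off, Math.min(4096, data.length - off));
      }
    }
    return digester.digest(); // safe after close(): the writer thread has been joined
  }

  public static void main(String[] args) throws Exception {
    byte[] data = new byte[1 << 20];
    new Random(42).nextBytes(data);
    // The digest should not depend on whether the async wrapper is used.
    System.out.println(Arrays.equals(md5(data, false), md5(data, true)));
  }
}
```

Because the async wrapper is outermost, the digest is updated on the background thread together with the write, which matches the split described in the issue. The digest should only be read after close() has returned.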

{code}
-        loadSecretManagerState(in);
+        //loadSecretManagerState(in);
{code}

Why is the call to loadSecretManagerState commented out?


I'll continue reviewing the patch later.


> Improve the speed of saveNameSpace, making HDFS restart and checkPoint faster
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-6154
>                 URL: https://issues.apache.org/jira/browse/HDFS-6154
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>    Affects Versions: 2.3.0
>            Reporter: guodongdong
>         Attachments: HDFS-6154-patch
>
>
> There are two stages in NameNode saveNamespace: serializing the INodes, and calculating the MD5 and writing to disk. Currently the two stages run serially. With this improvement they run in parallel: one thread serializes the INodes while another thread calculates the MD5 and writes to disk. This speeds up saveNamespace by up to 2x. Details are shown in the table below.
> Testing environment:
>   Only NameNode saveNamespace was tested, using dfsadmin -saveNamespace.
>     Machine: 144GB, Intel(R) Xeon(R) CPU E5645 @ 2.40GHz, 12 CPUs, RAID 5 SAS disk, JDK 1.7.0
>  
> ||image size||before optimizing||after optimizing ||
> |1.2GB|22sec|11sec|
> |4.3GB|66sec|36sec|
> |22GB|406sec|250sec|



--
This message was sent by Atlassian JIRA
(v6.2#6252)
