hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel
Date Thu, 21 Oct 2010 00:29:25 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923258#action_12923258 ]

Konstantin Shvachko commented on HDFS-1071:
-------------------------------------------

It looks to me that, with HDFS-903 in progress, this patch should guarantee that all images
are identical. Otherwise the MD5s of the different images will not be the same. So maybe my
suggestion of implementing this with one thread traversing the namespace tree and other threads
writing to the disks is more relevant now.
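
To illustrate the idea, here is a minimal sketch; it is not taken from any of the attached
patches. It assumes a flat list of entry names as a stand-in for the namespace tree, and the
names ParallelImageSaver, saveAll, writeImage and the output file name "fsimage" are all
hypothetical. One thread serializes each entry exactly once and hands the same bytes to one
queue per directory; each writer thread writes its own file and computes its own MD5, so every
copy receives the same byte stream.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelImageSaver {
    // Sentinel chunk (compared by identity) that marks the end of the stream.
    private static final byte[] END = new byte[0];

    /** Writes one image per directory and returns the MD5 hex digest of each copy. */
    public static List<String> saveAll(List<String> entries, List<Path> dirs) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(dirs.size());
        List<BlockingQueue<byte[]>> queues = new ArrayList<>();
        List<Future<String>> results = new ArrayList<>();
        try {
            for (Path dir : dirs) {
                BlockingQueue<byte[]> q = new ArrayBlockingQueue<>(16);
                Callable<String> task = () -> writeImage(dir.resolve("fsimage"), q);
                queues.add(q);
                results.add(pool.submit(task));
            }
            // Single traversal: each entry is serialized once, then fanned out.
            for (String entry : entries) {
                byte[] chunk = (entry + "\n").getBytes(StandardCharsets.UTF_8);
                for (BlockingQueue<byte[]> q : queues) {
                    q.put(chunk);
                }
            }
            for (BlockingQueue<byte[]> q : queues) {
                q.put(END);
            }
            List<String> digests = new ArrayList<>();
            for (Future<String> f : results) {
                digests.add(f.get()); // rethrows a writer failure as ExecutionException
            }
            return digests;
        } finally {
            pool.shutdownNow();
        }
    }

    private static String writeImage(Path file, BlockingQueue<byte[]> q) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        IOException failure = null;
        OutputStream out = null;
        try {
            out = Files.newOutputStream(file);
        } catch (IOException e) {
            failure = e;
        }
        try {
            for (byte[] chunk = q.take(); chunk != END; chunk = q.take()) {
                if (failure != null) {
                    continue; // keep draining so the traversal thread never blocks
                }
                try {
                    out.write(chunk);
                    md5.update(chunk);
                } catch (IOException e) {
                    failure = e;
                }
            }
        } finally {
            if (out != null) {
                out.close();
            }
        }
        if (failure != null) {
            throw failure;
        }
        return String.format("%032x", new BigInteger(1, md5.digest()));
    }
}
```

Because the traversal happens once, the copies cannot diverge the way they could if each
writer walked a namespace that is still changing. The bounded queues also keep a slow disk
from forcing the whole image to be buffered in memory.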

> savenamespace should write the fsimage to all configured fs.name.dir in parallel
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-1071
>                 URL: https://issues.apache.org/jira/browse/HDFS-1071
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: Dmytro Molkov
>         Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, HDFS-1071.5.patch, HDFS-1071.6.patch, HDFS-1071.patch
>
>
> If you have a large number of files in HDFS, the fsimage file is very big. When the namenode
> restarts, it writes a copy of the fsimage to all directories configured in fs.name.dir. This
> takes a long time, especially if there are many directories in fs.name.dir. Make the NN write
> the fsimage to all these directories in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

