hadoop-hdfs-dev mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-1780) reduce need to rewrite fsimage on startup
Date Wed, 23 Mar 2011 18:44:05 GMT
reduce need to rewrite fsimage on startup

                 Key: HDFS-1780
                 URL: https://issues.apache.org/jira/browse/HDFS-1780
             Project: Hadoop HDFS
          Issue Type: New Feature
            Reporter: Daryn Sharp

On startup, the namenode reads the fs image, applies the edits, then rewrites the fs image.
This takes a non-trivial amount of time for very large directory structures.  Perhaps the
namenode should employ some logic to decide whether the edits are simple enough that they
don't warrant rewriting the image back out to disk.

A few ideas:
Use the size of the edit logs: if the size is below a threshold, assume it's cheaper to
reprocess the edit log than to write the image back out.

Time the processing of the edits, and if the time is below a defined threshold, the image isn't
rewritten.

Time the reading of the image and the processing of the edits.  Base the decision on the
time it would take to write the image (perhaps the read time with a multiplier applied) versus
the time it would take to reprocess the edits.  If a certain threshold (perhaps a percentage or
an expected time to rewrite) is exceeded, rewrite the image.
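
The three ideas above could be sketched roughly as follows. This is only an
illustration in Python, not namenode code: the StartupTimings record, the
function names, and every default threshold and multiplier are hypothetical
placeholders, and the third check is just one possible reading of the
"percentage" threshold.

```python
from dataclasses import dataclass


@dataclass
class StartupTimings:
    """Measurements taken during startup (hypothetical record, not an HDFS type)."""
    image_read_secs: float    # time spent reading the fs image
    edits_replay_secs: float  # time spent applying the edit log
    edits_size_bytes: int     # on-disk size of the edit log


def should_rewrite_by_size(t, max_edits_bytes=64 * 1024 * 1024):
    """Idea 1: rewrite only once the edit log exceeds a size threshold."""
    return t.edits_size_bytes > max_edits_bytes


def should_rewrite_by_replay_time(t, max_replay_secs=30.0):
    """Idea 2: rewrite only once replaying the edits takes too long."""
    return t.edits_replay_secs > max_replay_secs


def should_rewrite_adaptive(t, write_multiplier=1.5, max_ratio=0.1):
    """Idea 3: estimate the write time from the read time, then rewrite when
    replaying the edits costs more than a set fraction of that estimate."""
    estimated_write_secs = t.image_read_secs * write_multiplier
    if estimated_write_secs <= 0:
        # No usable estimate (assumed fallback): keep today's behaviour and rewrite.
        return True
    return t.edits_replay_secs / estimated_write_secs > max_ratio
```

Because the third check compares two measured times instead of using fixed
limits, the same defaults would scale with the size of the image, which is the
property the last suggestion is after.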

Something along the lines of the last suggestion may allow for defaults that adapt to any
size cluster, thus eliminating the need to keep tweaking a cluster's settings based on its size.

