hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1780) reduce need to rewrite fsimage on startup
Date Thu, 24 Mar 2011 16:46:05 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010757#comment-13010757 ]

Daryn Sharp commented on HDFS-1780:
-----------------------------------

The high-level goal is to brainstorm how we might short-circuit costly and unnecessary work.
After Matt Foley's NN startup improvements are complete, it appears the image rewrite will
become a non-trivial part of startup time.  There's a suggestion to background the rewrite,
but it's worth considering if and when the rewrite might be avoided entirely.

I really like Todd's suggestion, although I think the NN would have to know whether it has
a reliable and functional 2NN.  I'm still coming up to speed on this project, so please forgive
any (seemingly obvious) misunderstandings on my part.

> reduce need to rewrite fsimage on startup
> -----------------------------------------
>
>                 Key: HDFS-1780
>                 URL: https://issues.apache.org/jira/browse/HDFS-1780
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Daryn Sharp
>
> On startup, the namenode reads the fs image, applies the edits, then rewrites the fs image.
> This requires a non-trivial amount of time for very large directory structures.  Perhaps
> the namenode should employ some logic to decide that the edits are simple enough that
> rewriting the image back out to disk isn't warranted.
>
> A few ideas:
> - Use the size of the edit logs: if the size is below a threshold, assume it's cheaper
>   to reprocess the edit log than to write the image back out.
> - Time the processing of the edits: if the time is below a defined threshold, the image
>   isn't rewritten.
> - Time both the reading of the image and the processing of the edits.  Base the decision
>   on the time it would take to write the image (a multiplier applied to the read time?)
>   versus the time it would take to reprocess the edits.  If a certain threshold (perhaps
>   a percentage or an expected time to rewrite) is exceeded, rewrite the image.
>
> Something along the lines of the last suggestion may allow for defaults that adapt to
> any size cluster, thus eliminating the need to keep tweaking a cluster's settings based on
> its size.
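
The last idea in the list above could be sketched roughly as follows. This is only an
illustration, written in Python for brevity rather than the namenode's Java, and it is not
existing HDFS code. The names StartupTimings and should_rewrite_image, and the default values
for write_multiplier and rewrite_ratio, are all assumptions made up for the example. The
issue only proposes that some multiplier and some threshold exist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StartupTimings:
    """Wall-clock measurements from one startup (hypothetical structure)."""
    image_read_secs: float   # time spent loading the fs image
    edits_apply_secs: float  # time spent replaying the edit log


def should_rewrite_image(timings: StartupTimings,
                         write_multiplier: float = 1.5,
                         rewrite_ratio: float = 0.25) -> bool:
    """Decide whether rewriting the image looks worthwhile.

    The write cost is estimated as write_multiplier * image_read_secs.
    The image is rewritten once replaying the edits costs at least
    rewrite_ratio of that estimated write cost. Both defaults are
    placeholder assumptions, not values taken from HDFS.
    """
    estimated_write_secs = write_multiplier * timings.image_read_secs
    return timings.edits_apply_secs >= rewrite_ratio * estimated_write_secs


if __name__ == "__main__":
    # Large image, tiny edit log: replaying the edits again is cheap, so skip.
    print(should_rewrite_image(StartupTimings(600.0, 5.0)))
    # Large image, long edit replay: rewriting saves time on the next start.
    print(should_rewrite_image(StartupTimings(600.0, 400.0)))
```

Because the decision compares two measured times instead of using a fixed byte or time
threshold, the same defaults scale with the size of the image, which is the adaptive
behaviour the description hopes for.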

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
