hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1793) Add code to inspect a storage directory with txid-based filenames
Date Sun, 08 May 2011 23:59:03 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13030561#comment-13030561 ]

Todd Lipcon commented on HDFS-1793:

bq. Shouldn't an inprogress file in the middle of sequence of logs be an error, looks like
it's not based on the test.

Currently, this is OK in the case of multiple NN crashes, since we don't rename the inprogress
logs after restarting the NN. For example:
* NN writing to edits_1_inprogress
* NN crashes after writing 10 transactions
* NN restarts, and starts writing to edits_11_inprogress
* NN rolls, renaming edits_11_inprogress to edits_11-20 (for example)

Here edits_1_inprogress is still valid, unfortunately, even though a later log follows it.
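
The scenario above can be sketched as a small check over the file names. This is only an illustrative Python sketch, not the code in the patch: the helpers `parse_log` and `inprogress_in_middle` are hypothetical, and the name patterns are taken solely from the example names in this comment (the actual scheme is the one designed in HDFS-1073 and may differ).

```python
import re

# Patterns taken from the example names in this comment; the real
# naming scheme is defined in HDFS-1073 and may differ.
FINALIZED = re.compile(r"^edits_(\d+)-(\d+)$")
INPROGRESS = re.compile(r"^edits_(\d+)_inprogress$")


def parse_log(name):
    """Return (start_txid, end_txid or None, is_inprogress) for a log name."""
    m = FINALIZED.match(name)
    if m:
        return int(m.group(1)), int(m.group(2)), False
    m = INPROGRESS.match(name)
    if m:
        return int(m.group(1)), None, True
    raise ValueError("not an edit log name: " + name)


def inprogress_in_middle(names):
    """Return the inprogress logs that are followed by a later log."""
    # Sort by start txid so that list position reflects log order.
    logs = sorted((parse_log(n) + (n,) for n in names), key=lambda t: t[0])
    return [name for i, (_, _, inprog, name) in enumerate(logs)
            if inprog and i < len(logs) - 1]


# Directory contents after the four steps listed above.
names = ["edits_1_inprogress", "edits_11-20"]
print(inprogress_in_middle(names))
```

With the current behavior, an inspector would have to treat every name this function returns as a legitimate log rather than as an error.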

The solution would be to have the NN rename any inprogress log after it's been read, based
on the number of transactions in it. Do you think this is necessary?
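
As a rough sketch of what that rename would compute (again hypothetical Python, using only the example names from this comment; `finalized_name` is not an existing function and nothing here touches the filesystem):

```python
def finalized_name(inprogress_name, num_txns):
    """Name an inprogress log would get once its transactions are counted.

    Hypothetical helper: assumes names of the form edits_<start>_inprogress
    and that the log holds num_txns consecutive transactions from <start>.
    """
    prefix, suffix = "edits_", "_inprogress"
    if not (inprogress_name.startswith(prefix)
            and inprogress_name.endswith(suffix)):
        raise ValueError("not an inprogress log: " + inprogress_name)
    if num_txns < 1:
        # What to do with an empty log is left open in this sketch.
        raise ValueError("log contains no transactions")
    start = int(inprogress_name[len(prefix):-len(suffix)])
    return "edits_%d-%d" % (start, start + num_txns - 1)


# The crashed log from the example above held 10 transactions.
print(finalized_name("edits_1_inprogress", 10))
```

With such a rename in place, the only inprogress file left after a restart would be the one currently being written, so an inprogress file in the middle of the sequence could be treated as an error.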


I'm bumping this each time I do a merge from trunk.

bq. The log in FSImage#inspectStorageDirs for a directory with no VERSION file should perhaps
be a warn instead of info.
bq. newline need before declaration of FoundFSImage

bq. Wrt TODO by getLatestImage can be annotated as test-only
Actually this turned out to be used, so I removed this TODO, thanks.

(Since this was a post-commit review, I just committed the above changes to the branch, rather
than uploading a new patch)

> Add code to inspect a storage directory with txid-based filenames
> -----------------------------------------------------------------
>                 Key: HDFS-1793
>                 URL: https://issues.apache.org/jira/browse/HDFS-1793
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
>         Attachments: HDFS-1793.txt, hdfs-1793.txt
> After HDFS-1538, the startup of the NN is done with the following phases:
> - Inspect storage directories
> - Formulate a "load plan", where a load plan is:
> -- A set of recovery actions (eg deleting half-uploaded checkpoint uploads)
> -- The fsimage to restore
> -- A sequence of edit logs to replay
> - Run the recovery
> - Load specified image
> - Load specified edits.
> This JIRA adds code to inspect a set of storage directories where the image and edits
> files are named according to the scheme designed in HDFS-1073, and formulate such a load plan.
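
To make the shape of such a load plan concrete, here is a minimal Python sketch. It is not the code in the attached patch. The names `LoadPlan` and `formulate_load_plan` are hypothetical, as are the `fsimage_<txid>` and `fsimage_<txid>.ckpt` file names; the edit log names follow the example in the comment above. It also assumes that an image at txid N already contains every transaction up to and including N, and it does not check for gaps between logs.

```python
import re
from dataclasses import dataclass, field
from typing import List

IMAGE = re.compile(r"^fsimage_(\d+)$")        # hypothetical image name
CKPT = re.compile(r"^fsimage_(\d+)\.ckpt$")   # hypothetical partial upload
EDITS = re.compile(r"^edits_(\d+)(?:-\d+|_inprogress)$")


@dataclass
class LoadPlan:
    recovery_actions: List[str] = field(default_factory=list)
    image: str = ""
    edit_logs: List[str] = field(default_factory=list)


def formulate_load_plan(names):
    """Build a plan from the file names found in one storage directory."""
    images, edits, plan = [], [], LoadPlan()
    for name in names:
        if CKPT.match(name):
            # Half-uploaded checkpoints are cleaned up, never loaded.
            plan.recovery_actions.append("delete " + name)
        elif IMAGE.match(name):
            images.append((int(IMAGE.match(name).group(1)), name))
        elif EDITS.match(name):
            edits.append((int(EDITS.match(name).group(1)), name))
    if not images:
        raise ValueError("no fsimage found")
    # Restore the newest image, then replay only the logs that start after it.
    image_txid, plan.image = max(images)
    plan.edit_logs = [n for start, n in sorted(edits) if start > image_txid]
    return plan


plan = formulate_load_plan([
    "fsimage_0", "fsimage_10", "fsimage_20.ckpt",
    "edits_1-10", "edits_11-20", "edits_21_inprogress",
])
print(plan)
```

Keeping the plan as plain data separates inspection from execution: the recovery actions, image load and edits replay listed above can then run as later phases against the same plan.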

