hadoop-hdfs-issues mailing list archives

From "Matt Foley (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1921) Save namespace can cause NN to be unable to come up on restart
Date Fri, 13 May 2011 21:50:47 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13033329#comment-13033329 ]

Matt Foley commented on HDFS-1921:
----------------------------------

None of the test errors are related to this patch (all four are recurring; see HDFS-1852).
I agree with Aaron that his new unit test for HDFS-1505 is a good test for this patch too,
so no additional unit tests are needed. The core of that unit test is attached to this Jira
and passes local testing.

> Save namespace can cause NN to be unable to come up on restart
> --------------------------------------------------------------
>
>                 Key: HDFS-1921
>                 URL: https://issues.apache.org/jira/browse/HDFS-1921
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.22.0, 0.23.0
>            Reporter: Aaron T. Myers
>            Assignee: Matt Foley
>            Priority: Blocker
>             Fix For: 0.22.0, 0.23.0
>
>         Attachments: hdfs-1505-1-test.txt, hdfs1921_v23.patch, hdfs1921_v23.patch
>
>
> I discovered this in the course of trying to implement a fix for HDFS-1505.
> Per the comment for {{FSImage.saveNamespace(...)}}, the algorithm for save namespace
> proceeds in the following order:
> # rename current to lastcheckpoint.tmp for all of them,
> # save image and recreate edits for all of them,
> # rename lastcheckpoint.tmp to previous.checkpoint.
> The problem is that step 3 runs even if step 2 fails for every storage directory.
> Upon restart, the NN will see non-existent or corrupt {{current}} directories and no
> {{lastcheckpoint.tmp}} directories, and so will conclude that the storage directories
> are not formatted.
> This issue appears to be present on both 0.22 and 0.23. This should arguably be a
> 0.22/0.23 blocker.
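
The failure mode described above can be illustrated with a minimal sketch. This is not the code in the attached patch or in FSImage; the class SaveNamespaceSketch, the ImageWriter interface, and the method signature are hypothetical names chosen for illustration. It walks the three quoted steps over plain directories and adds one possible guard: step 3 is skipped when step 2 failed everywhere, so lastcheckpoint.tmp survives for recovery.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; all names here are hypothetical, not Hadoop APIs.
class SaveNamespaceSketch {

    /** Hypothetical stand-in for "save image and recreate edits" in one directory. */
    interface ImageWriter {
        void write(Path currentDir) throws IOException;
    }

    /**
     * Runs the three steps over all storage directories and returns the
     * directories whose new "current" was written successfully.
     * Assumes no "previous.checkpoint" exists yet in any directory.
     */
    static List<Path> saveNamespace(List<Path> storageDirs, ImageWriter writer)
            throws IOException {
        // Step 1: rename current to lastcheckpoint.tmp for all directories.
        for (Path sd : storageDirs) {
            Files.move(sd.resolve("current"), sd.resolve("lastcheckpoint.tmp"));
        }

        // Step 2: save image and recreate edits, tracking which directories succeeded.
        List<Path> saved = new ArrayList<>();
        for (Path sd : storageDirs) {
            try {
                Path current = Files.createDirectory(sd.resolve("current"));
                writer.write(current);
                saved.add(sd);
            } catch (IOException e) {
                // This directory failed; its lastcheckpoint.tmp is left untouched.
            }
        }

        // Guard: if step 2 failed everywhere, do not run step 3. Renaming
        // lastcheckpoint.tmp away here is what would leave no usable state.
        if (saved.isEmpty()) {
            throw new IOException(
                "save failed in all storage directories; lastcheckpoint.tmp kept");
        }

        // Step 3: rename lastcheckpoint.tmp to previous.checkpoint, but only
        // in directories that now hold a freshly written current.
        for (Path sd : saved) {
            Files.move(sd.resolve("lastcheckpoint.tmp"),
                       sd.resolve("previous.checkpoint"));
        }
        return saved;
    }
}
```

With a writer that always throws, the call ends in an IOException and every directory still contains lastcheckpoint.tmp. Without the guard, step 3 would rename it away and leave only a missing or corrupt current, which is the state the report says the NN treats as unformatted.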

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
