hadoop-hdfs-dev mailing list archives

From "Aaron T. Myers (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-1921) Save namespace can cause NN to be unable to come up on restart
Date Wed, 11 May 2011 23:19:47 GMT
Save namespace can cause NN to be unable to come up on restart
--------------------------------------------------------------

                 Key: HDFS-1921
                 URL: https://issues.apache.org/jira/browse/HDFS-1921
             Project: Hadoop HDFS
          Issue Type: Bug
    Affects Versions: 0.22.0, 0.23.0
            Reporter: Aaron T. Myers
            Priority: Critical
             Fix For: 0.22.0, 0.23.0


I discovered this in the course of trying to implement a fix for HDFS-1505.

Per the comment for {{FSImage.saveNamespace(...)}}, the algorithm for save namespace proceeds
in the following order:

# rename {{current}} to {{lastcheckpoint.tmp}} in every storage directory,
# save the image and recreate edits in every storage directory,
# rename {{lastcheckpoint.tmp}} to {{previous.checkpoint}}.

The problem is that step 3 runs even if step 2 fails for every storage directory. By that
point step 1 has already moved each {{current}} directory aside, so upon restart the NN will
see non-existent or corrupt {{current}} directories and no {{lastcheckpoint.tmp}} directories,
and will conclude that the storage directories are not formatted.
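
The failure mode can be illustrated with a small in-memory sketch. This is not the {{FSImage}} code: the class {{SaveNamespaceSketch}}, its {{StorageDir}} type, the {{guarded}} flag, and the {{looksFormatted}} check are all hypothetical names made up for illustration, and the "formatted" test is reduced to "has {{current}} or {{lastcheckpoint.tmp}}", which is an assumption based only on the description above. The guarded branch shows one conceivable way to avoid the problem, not necessarily the fix that will be applied.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SaveNamespaceSketch {

    /** Hypothetical stand-in for one storage directory: only the names of its subdirectories. */
    static final class StorageDir {
        final Set<String> entries = new HashSet<>();
        final boolean failOnSave; // simulates an error while saving the image in step 2

        StorageDir(boolean failOnSave) {
            this.failOnSave = failOnSave;
            entries.add("current"); // starts out as a formatted directory
        }

        void rename(String from, String to) {
            if (entries.remove(from)) {
                entries.add(to);
            }
        }

        boolean saveImage() {
            if (failOnSave) {
                return false; // no usable current directory is produced
            }
            entries.add("current");
            return true;
        }
    }

    /** Runs the three steps; with guarded == false, step 3 runs unconditionally. */
    static void saveNamespace(List<StorageDir> dirs, boolean guarded) {
        // Step 1: move current aside in every storage directory.
        for (StorageDir d : dirs) {
            d.rename("current", "lastcheckpoint.tmp");
        }
        // Step 2: save the image, counting the directories that succeeded.
        int saved = 0;
        for (StorageDir d : dirs) {
            if (d.saveImage()) {
                saved++;
            }
        }
        // Assumed guard (not from the source): if nothing was saved, roll back step 1.
        if (guarded && saved == 0) {
            for (StorageDir d : dirs) {
                d.rename("lastcheckpoint.tmp", "current");
            }
            return;
        }
        // Step 3: turn the old state into previous.checkpoint.
        for (StorageDir d : dirs) {
            d.rename("lastcheckpoint.tmp", "previous.checkpoint");
        }
    }

    /** Simplified restart check: the directory counts as formatted if either name is present. */
    static boolean looksFormatted(StorageDir d) {
        return d.entries.contains("current") || d.entries.contains("lastcheckpoint.tmp");
    }

    /** Builds dirCount directories that all fail in step 2, saves, and reports the restart view. */
    static boolean anyFormattedAfterFailedSave(int dirCount, boolean guarded) {
        List<StorageDir> dirs = new ArrayList<>();
        for (int i = 0; i < dirCount; i++) {
            dirs.add(new StorageDir(true));
        }
        saveNamespace(dirs, guarded);
        for (StorageDir d : dirs) {
            if (looksFormatted(d)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("unguarded, any formatted dir: " + anyFormattedAfterFailedSave(2, false));
        System.out.println("guarded, any formatted dir:   " + anyFormattedAfterFailedSave(2, true));
    }
}
```

In the unguarded run every directory ends up holding only {{previous.checkpoint}}, which matches the "not formatted" outcome described above. In the guarded run the directories are rolled back to {{current}}.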

This issue appears to be present on both 0.22 and 0.23. This should arguably be a 0.22/0.23
blocker.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
