hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2749) Wrong fsimage format while entering recovery mode
Date Wed, 04 Jan 2012 16:33:40 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179614#comment-13179614 ]

Todd Lipcon commented on HDFS-2749:
-----------------------------------

Good find! I think this is probably the cause of HDFS-1029 which I saw in 2010. Do you have
a unit test and/or patch for the problem?
                
> Wrong fsimage format while entering recovery mode
> -------------------------------------------------
>
>                 Key: HDFS-2749
>                 URL: https://issues.apache.org/jira/browse/HDFS-2749
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.20.2
>            Reporter: Denny Ye
>            Priority: Critical
>              Labels: hdfs
>
> On startup, Hadoop may enter recovery mode and save the namespace to disk before the system starts serving requests. Many situations cause Hadoop to enter recovery mode, such as a missing VERSION file and a ckpt file left behind by a failed checkpoint.
> In recovery mode, the namespace is loaded from the previous fsimage, and the default numFiles of namespace.rootDir is 1. The numFiles value is read from the fsimage (readInt for the version, readInt for the namespaceId, readLong for numFiles).
> However, the numFiles value is not updated in the namespace before the namespace is saved.
> Saving the namespace immediately after loading the fsimage therefore writes numFiles with its default value of 1 to disk.
> The next time the saved fsimage is loaded from disk, either on reboot or when the SecondaryNameNode performs a checkpoint, the system crashes (OOM) because this fsimage is incorrect.
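
For readers following the description above, here is a minimal sketch of the reported failure pattern. It is not Hadoop's FSImage code: the class FsImageHeaderSketch, the Namespace type, the updateCount flag, and the simplified layout (header followed by one UTF string per entry) are all hypothetical. Only the header order (readInt version, readInt namespaceId, readLong numFiles) and the default numFiles of 1 come from the report.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class FsImageHeaderSketch {
    // Hypothetical stand-in for the in-memory namespace; not a Hadoop class.
    static final class Namespace {
        int layoutVersion;
        int namespaceId;
        long numFiles = 1;              // default from the report: root directory only
        String[] paths = new String[0]; // simplified image body
    }

    // Writes the header (int, int, long) followed by every entry in memory.
    static byte[] save(Namespace ns) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(ns.layoutVersion);
            out.writeInt(ns.namespaceId);
            out.writeLong(ns.numFiles); // stale if load() never updated it
            for (String p : ns.paths) {
                out.writeUTF(p);
            }
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the header and body. With updateCount == false, the count read
    // from the image is used for parsing but never stored in the namespace,
    // which mirrors the behavior described in the report.
    static Namespace load(byte[] image, boolean updateCount) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(image));
            Namespace ns = new Namespace();
            ns.layoutVersion = in.readInt();
            ns.namespaceId = in.readInt();
            long numFiles = in.readLong();
            ns.paths = new String[(int) numFiles];
            for (int i = 0; i < numFiles; i++) {
                ns.paths[i] = in.readUTF();
            }
            if (updateCount) {
                ns.numFiles = numFiles;
            }
            return ns;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns only the numFiles field from a saved image's header.
    static long headerNumFiles(byte[] image) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(image));
            in.readInt();  // version
            in.readInt();  // namespaceId
            return in.readLong();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Saves a three-entry namespace, then loads and immediately re-saves it.
    static long resavedHeaderCount(boolean updateCount) {
        Namespace original = new Namespace();
        original.layoutVersion = -18;   // arbitrary illustrative values
        original.namespaceId = 42;
        original.paths = new String[] {"/", "/a", "/b"};
        original.numFiles = original.paths.length;
        byte[] resaved = save(load(save(original), updateCount));
        return headerNumFiles(resaved);
    }

    public static void main(String[] args) {
        System.out.println(resavedHeaderCount(false)); // prints 1
        System.out.println(resavedHeaderCount(true));  // prints 3
    }
}
```

In this sketch the re-saved image claims one entry while its body holds three, so the header no longer describes the data that follows it. How that mismatch leads to the OOM reported above depends on the real loader and is not modeled here.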


        
