hadoop-hdfs-dev mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-4015) Safemode should count and report orphaned blocks
Date Sat, 06 Oct 2012 10:47:02 GMT
Todd Lipcon created HDFS-4015:
---------------------------------

             Summary: Safemode should count and report orphaned blocks
                 Key: HDFS-4015
                 URL: https://issues.apache.org/jira/browse/HDFS-4015
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: name-node
    Affects Versions: 3.0.0
            Reporter: Todd Lipcon


The safemode status currently reports the number of unique reported blocks compared to the
total number of blocks referenced by the namespace. However, it does not report the inverse:
blocks which are reported by datanodes but not referenced by the namespace.
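
As a rough illustration of the missing statistic, the count could be computed as the set of distinct reported block IDs that the namespace does not reference. This is only a sketch: `OrphanedBlockCounter` and `countOrphaned` are hypothetical names, and plain `Long` IDs stand in for whatever block representation the NameNode actually uses.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of HDFS: counts reported blocks unknown to the namespace.
final class OrphanedBlockCounter {
    /**
     * Returns the number of distinct block IDs that datanodes reported
     * but that are not referenced by any file in the namespace.
     */
    static long countOrphaned(Set<Long> namespaceBlockIds, Iterable<Long> reportedBlockIds) {
        Set<Long> orphaned = new HashSet<>();
        for (Long id : reportedBlockIds) {
            if (!namespaceBlockIds.contains(id)) {
                // The set deduplicates replicas of the same block reported by several datanodes.
                orphaned.add(id);
            }
        }
        return orphaned.size();
    }
}
```

Counting distinct IDs rather than raw reports mirrors the existing "unique reported blocks" figure, so the two numbers stay comparable.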

If an admin accidentally starts up from an old image, this can be confusing:
safemode and fsck will show "corrupt files", which are actually files that had been
deleted but were resurrected by restarting from the old image. This can convince the admin
that it is safe to force-leave safemode and remove these files -- after all, they know that
those files should really have been deleted. However, they may not be aware that leaving
safemode will also unrecoverably delete other block files which have been orphaned by the
namespace rollback.

I'd like to consider reporting something like: "900000 of expected 1000000 blocks have been
reported. Additionally, 10000 blocks have been reported which do not correspond to any file
in the namespace. Forcing exit of safemode will unrecoverably remove those data blocks."
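
A minimal sketch of how such a status line might be assembled, assuming the three counts are already available. `SafeModeStatus` and `describe` are hypothetical names for illustration and are not the existing NameNode code; the wording follows the message proposed above.

```java
// Hypothetical formatter, not part of HDFS: builds the proposed safemode status text.
final class SafeModeStatus {
    static String describe(long reported, long expected, long orphaned) {
        StringBuilder sb = new StringBuilder();
        sb.append(reported).append(" of expected ").append(expected)
          .append(" blocks have been reported.");
        // Only add the warning when there is something to warn about.
        if (orphaned > 0) {
            sb.append(" Additionally, ").append(orphaned)
              .append(" blocks have been reported which do not correspond to any file")
              .append(" in the namespace. Forcing exit of safemode will unrecoverably")
              .append(" remove those data blocks.");
        }
        return sb.toString();
    }
}
```

Calling `SafeModeStatus.describe(900000, 1000000, 10000)` would produce the example message; with an orphaned count of zero, only the first sentence is emitted.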

Using this statistic for some kind of "inverse safe mode" would be the logical next
step, but just reporting it as a warning seems easy enough to accomplish and worth doing.

