hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-10360) DataNode may format directory and lose blocks if current/VERSION is missing
Date Tue, 03 May 2016 21:25:12 GMT
Wei-Chiu Chuang created HDFS-10360:
--------------------------------------

             Summary: DataNode may format directory and lose blocks if current/VERSION is missing
                 Key: HDFS-10360
                 URL: https://issues.apache.org/jira/browse/HDFS-10360
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
            Reporter: Wei-Chiu Chuang
            Assignee: Wei-Chiu Chuang


Under certain circumstances, if the current/VERSION of a storage directory is missing, DataNode
may format the storage directory even though _block files are not missing_.

This is very easy to reproduce. Launch an HDFS cluster and create some files. Delete
current/VERSION, and restart the DataNode.

After the restart, the DataNode formats the directory and removes all existing block files:

{noformat}
2016-05-03 12:57:15,387 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/dfs/dn/in_use.lock acquired by nodename 5314@weichiu-dn-2.vpc.cloudera.com
2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /data/dfs/dn is not formatted for BP-787466439-172.26.24.43-1462305406642
2016-05-03 12:57:15,389 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: Analyzing storage directories for bpid BP-787466439-172.26.24.43-1462305406642
2016-05-03 12:57:15,464 INFO org.apache.hadoop.hdfs.server.common.Storage: Locking is disabled for /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642
2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: Block pool storage directory /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642 is not formatted for BP-787466439-172.26.24.43-1462305406642
2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting ...
2016-05-03 12:57:15,465 INFO org.apache.hadoop.hdfs.server.common.Storage: Formatting block pool BP-787466439-172.26.24.43-1462305406642 directory /data/dfs/dn/current/BP-787466439-172.26.24.43-1462305406642/current
{noformat}

The bug: if none of {{current/VERSION}}, {{previous/}}, {{previous.tmp/}},
{{removed.tmp/}}, {{finalized.tmp/}} and {{lastcheckpoint.tmp/}} exists, DataNode assumes the
storage directory contains nothing important to HDFS and formats it. However, block files may
still exist, and in my opinion we should do everything possible to retain them.
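
To make the failure mode concrete, here is a minimal, dependency-free sketch of the heuristic as described above. It is not the actual {{Storage#analyzeStorage}} code, which is more involved; the class name {{NaiveStorageCheck}} and the method {{looksUnformatted}} are hypothetical, and only the marker names are taken from this report.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not Hadoop code.
final class NaiveStorageCheck {
    // Marker entries listed in the report. If none of them exists, the
    // directory is treated as "not formatted".
    static final String[] MARKERS = {
        "current/VERSION", "previous", "previous.tmp",
        "removed.tmp", "finalized.tmp", "lastcheckpoint.tmp"
    };

    // Returns true when no marker exists under storageDir. Note that the
    // contents of current/ (for example block files) are never inspected.
    static boolean looksUnformatted(Path storageDir) {
        for (String marker : MARKERS) {
            if (Files.exists(storageDir.resolve(marker))) {
                return false;
            }
        }
        return true;
    }
}
```

With a directory that contains {{current/<block pool>/<block file>}} but no {{current/VERSION}}, {{looksUnformatted}} returns true, which is the situation that leads to the format in the log above.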

I have two suggestions:
# Check whether the {{current/}} directory is empty. If it is not, throw an InconsistentFSStateException
in {{Storage#analyzeStorage}} instead of assuming it is not formatted. Or,
# In {{Storage#clearDirectory}}, rename or move the {{current/}} directory before formatting the
storage directory. Also, log whatever is being renamed or moved.
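
A rough sketch of both suggestions, written against plain {{java.nio.file}} rather than the Hadoop {{Storage}} classes. Everything here is an assumption for illustration: the class {{SafeStorageCheck}}, its method names, and the {{current.bak.<timestamp>}} backup name are hypothetical, and a plain {{IOException}} stands in for InconsistentFSStateException so the example has no Hadoop dependency.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

// Hypothetical illustration only; not Hadoop code.
final class SafeStorageCheck {
    // True if dir is absent, is not a directory, or has no entries.
    static boolean hasNoEntries(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return true;
        }
        try (Stream<Path> entries = Files.list(dir)) {
            return !entries.findAny().isPresent();
        }
    }

    // Suggestion 1: refuse to format when current/ still holds data.
    static void checkBeforeFormat(Path storageDir) throws IOException {
        Path current = storageDir.resolve("current");
        if (!hasNoEntries(current)) {
            throw new IOException("Refusing to format " + storageDir
                + ": " + current + " is not empty");
        }
    }

    // Suggestion 2: move current/ aside before formatting and log the move.
    // Returns the backup path, or null when there was nothing to preserve.
    static Path moveCurrentAside(Path storageDir) throws IOException {
        Path current = storageDir.resolve("current");
        if (hasNoEntries(current)) {
            return null;
        }
        Path backup = storageDir.resolve("current.bak." + System.currentTimeMillis());
        // Source and target share a parent directory, so this is a rename.
        Files.move(current, backup, StandardCopyOption.ATOMIC_MOVE);
        System.err.println("Moved " + current + " to " + backup);
        return backup;
    }
}
```

The first variant fails fast and leaves the directory untouched for an operator to inspect; the second lets startup proceed while keeping the old block files recoverable under the backup name.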



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


