hadoop-hdfs-issues mailing list archives

From "Wei-Chiu Chuang (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HDFS-11022) DataNode unable to remove corrupt block replica due to race condition
Date Mon, 17 Oct 2016 22:18:58 GMT
Wei-Chiu Chuang created HDFS-11022:
--------------------------------------

             Summary: DataNode unable to remove corrupt block replica due to race condition
                 Key: HDFS-11022
                 URL: https://issues.apache.org/jira/browse/HDFS-11022
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode, namenode
    Affects Versions: 2.6.0
         Environment: CDH5.7.0
            Reporter: Wei-Chiu Chuang
            Priority: Critical



Scenario:
# A client reads a replica blk_A_x from a DataNode and detects corruption.
# In the meantime, the replica is appended, updating its generation stamp from x to y.
# The client tells NN to mark the replica blk_A_x corrupt.
# NN tells the DataNode to (1) delete replica blk_A_x and (2) replicate the newer replica
blk_A_y from another DataNode. Due to the block placement policy, blk_A_y is replicated to the
same node. (It's a small cluster.)
# DN is unable to receive the newer replica blk_A_y, because the replica already exists.
# DN is also unable to delete replica blk_A_x, because blk_A_x no longer exists (its generation stamp is now y).
# The replica on the DN is not part of the data pipeline, so it becomes stale.
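
The deadlock in the scenario above can be reproduced with a toy model. The sketch below is not Hadoop code: the class StaleReplicaSketch and its methods are hypothetical, and it assumes that a delete must match the stored generation stamp exactly and that a receive is rejected when a replica with the same or a newer generation stamp is already present.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a DataNode's replica map. Hypothetical, not the real HDFS classes. */
public class StaleReplicaSketch {
    /** Block ID -> generation stamp of the replica currently stored. */
    private final Map<Long, Long> replicas = new HashMap<>();

    /** An append bumps the generation stamp of the stored replica. */
    public void append(long blockId, long newGenStamp) {
        replicas.put(blockId, newGenStamp);
    }

    /** Assumption: delete succeeds only if the requested stamp matches the stored one. */
    public boolean delete(long blockId, long genStamp) {
        Long stored = replicas.get(blockId);
        if (stored == null || stored != genStamp) {
            return false; // the requested replica "does not exist"
        }
        replicas.remove(blockId);
        return true;
    }

    /** Assumption: receive is rejected if the same or a newer replica is already stored. */
    public boolean receive(long blockId, long genStamp) {
        Long stored = replicas.get(blockId);
        if (stored != null && stored >= genStamp) {
            return false; // the replica "already exists"
        }
        replicas.put(blockId, genStamp);
        return true;
    }

    /** Returns the stored generation stamp, or null if the block is absent. */
    public Long storedGenStamp(long blockId) {
        return replicas.get(blockId);
    }

    public static void main(String[] args) {
        StaleReplicaSketch dn = new StaleReplicaSketch();
        long blockA = 1L, x = 100L, y = 101L;

        dn.receive(blockA, x); // DN initially holds blk_A_x
        dn.append(blockA, y);  // step 2: append moves the stamp from x to y

        // Step 4: NN asks for deletion of blk_A_x and replication of blk_A_y.
        System.out.println("delete blk_A_x:  " + dn.delete(blockA, x));  // false
        System.out.println("receive blk_A_y: " + dn.receive(blockA, y)); // false
        System.out.println("stored stamp:    " + dn.storedGenStamp(blockA)); // 101
    }
}
```

Under these assumptions both commands fail and the replica with stamp y stays on the node, which matches the stale state described in the last step.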

If another replica becomes corrupt and the NameNode wants to replicate a healthy replica to this
DataNode, it can't, because the stale replica is still there. Because this is a small cluster, soon
enough (within an hour) no DataNode is able to receive a healthy replica.

This cluster also suffers from HDFS-11019, so even though the DataNode later detected the data
corruption, it was unable to report it to the NameNode.

Note that we are still investigating the root cause of the corruption.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

