hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HDFS-900) Corrupt replicas are not tracked correctly through block report from DN
Date Thu, 03 Feb 2011 09:50:28 GMT

     [ https://issues.apache.org/jira/browse/HDFS-900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HDFS-900:

    Attachment: reportCorruptBlock.patch

Yes, this is indeed a bug in the block report. After step 3 in Todd's description the NN has three
good replicas and one corrupt one. The corrupt replica is in recentInvalidatesSet, but not in
the DatanodeDescriptor; that is, the replica is scheduled for deletion from the DN. See blockReceived().

But before it is deleted from the DN, that same DN sends a block report, which contains the
replica. DatanodeDescriptor.processReport() treats it as a new replica, because it is not in
the DatanodeDescriptor, and as a good one, since its blockId, generationStamp, and length are
all in order.
The fix is to ignore replicas that are scheduled for deletion from this DN.
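
The fix described above can be sketched as follows. This is a simplified, hypothetical model of the NN bookkeeping (the class and field names here are illustrative, not the actual FSNamesystem code): during a block report, a replica that is already queued for invalidation on the reporting DN is skipped instead of being re-registered as a good replica.

```java
import java.util.*;

// Hypothetical sketch, not the actual Hadoop source: skip replicas
// already scheduled for deletion from the reporting DN.
class BlockReportSketch {
    // blockId -> IDs of datanodes whose copy is pending invalidation
    final Map<Long, Set<String>> recentInvalidateSets = new HashMap<>();
    // blocks currently recorded as live replicas on the reporting DN
    final Set<Long> liveReplicas = new HashSet<>();

    void processReportedBlock(String dnId, long blockId) {
        Set<String> pending = recentInvalidateSets.get(blockId);
        if (pending != null && pending.contains(dnId)) {
            // Replica is queued for deletion from this DN: ignore it
            // rather than re-adding it as a good replica.
            return;
        }
        liveReplicas.add(blockId); // otherwise record it as a live replica
    }
}
```

With this check, the corrupt replica in the scenario above is never counted as a fourth good replica, so the NN does not schedule deletion of a genuinely good one.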
I tested this patch with the test case attached by Todd, thanks. The test passes with the
fix and fails without it.
The test case is not exactly a unit test, as it introduces changes to the FSNamesystem class
for testing, so I did not include it in the patch.
Todd, is it possible to convert your case into a real unit test?

> Corrupt replicas are not tracked correctly through block report from DN
> -----------------------------------------------------------------------
>                 Key: HDFS-900
>                 URL: https://issues.apache.org/jira/browse/HDFS-900
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Priority: Blocker
>             Fix For: 0.22.0
>         Attachments: log-commented, reportCorruptBlock.patch, to-reproduce.patch
> This one is tough to describe, but essentially the following order of events is seen
> to occur:
> # A client marks one replica of a block to be corrupt by telling the NN about it
> # Replication is then scheduled to make a new replica of this block
> # The replication completes, such that there are now 3 good replicas and 1 corrupt replica
> # The DN holding the corrupt replica sends a block report. Rather than telling this DN
> to delete the replica, the NN instead marks this as a new *good* replica of the block, and
> schedules deletion of one of the good replicas.
> I don't know if this is a dataloss bug in the case of 1 corrupt replica with dfs.replication=2,
> but it seems feasible. I will attach a debug log with some commentary marked by '============>',
> plus a unit test patch with which I can reproduce this behavior reliably. (It's not a proper
> unit test, just some edits to an existing one to show it.)

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

