hadoop-hdfs-issues mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1103) Replica recovery doesn't distinguish between flushed-but-corrupted last chunk and unflushed last chunk
Date Fri, 14 May 2010 18:02:48 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867606#action_12867606 ]

Hairong Kuang commented on HDFS-1103:
-------------------------------------

In the 0.21 append design, if every BlockReceiver made sure that its buffered packet
was flushed to disk before it exits on error, then I do not think the problem you described
would happen. The code probably does not enforce this now. I do not think that we should use
the max of the RBWs (replicas being written). In 0.21, there is no concept of validate length for RBWs.
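
To make the "flush before exiting on error" idea concrete, here is a minimal sketch. It is not the actual BlockReceiver code: the class `BufferingReceiver`, its `receive` method, and the `failAtIndex` parameter are hypothetical names used only for illustration. It assumes the receiver holds one packet in memory and shows a `finally` block pushing that packet to the underlying stream even when receiving fails.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class FlushOnExitSketch {
    /** Hypothetical stand-in for a receiver that buffers one packet before writing it. */
    static final class BufferingReceiver {
        private final OutputStream disk;
        private final ByteArrayOutputStream buffered = new ByteArrayOutputStream();

        BufferingReceiver(OutputStream disk) {
            this.disk = disk;
        }

        /** Receives packets in order; failAtIndex simulates an error (use -1 for none). */
        void receive(byte[][] packets, int failAtIndex) throws IOException {
            try {
                for (int i = 0; i < packets.length; i++) {
                    if (i == failAtIndex) {
                        throw new IOException("simulated receive error");
                    }
                    flushBuffered();            // write out the previously buffered packet
                    buffered.write(packets[i]); // hold the current packet in memory
                }
            } finally {
                // Runs on normal and error exit alike, so no buffered packet is left behind.
                flushBuffered();
            }
        }

        private void flushBuffered() throws IOException {
            buffered.writeTo(disk);
            disk.flush();
            buffered.reset();
        }
    }
}
```

With this shape, a packet that was accepted before the failure still reaches the stream, so the on-disk length reflects everything the receiver acknowledged holding.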

> Replica recovery doesn't distinguish between flushed-but-corrupted last chunk and unflushed last chunk
> ------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1103
>                 URL: https://issues.apache.org/jira/browse/HDFS-1103
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Priority: Blocker
>         Attachments: hdfs-1103-test.txt
>
>
> When the DN creates a replica under recovery, it calls validateIntegrity, which truncates
> the last checksum chunk off of a replica if it is found to be invalid. Then when the block
> recovery process happens, this shortened block wins over a longer replica from another node
> where there was no corruption. Thus, if just one of the DNs has an invalid last checksum chunk,
> data that has been sync()ed to other datanodes can be lost.
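
The data-loss scenario described above can be sketched with a small model. This is not the DataNode implementation: `truncateLastChunk`, `recoveredLength`, and the 512-byte chunk size are assumptions made for illustration, and "the shortened block wins" is modelled simply as taking the minimum replica length.

```java
import java.util.List;

/** Hypothetical model of the scenario; names and chunk size are assumptions. */
public class ReplicaRecoverySketch {
    static final int BYTES_PER_CHECKSUM = 512; // assumed checksum chunk size

    /** Length left after dropping the last (possibly partial) checksum chunk. */
    static long truncateLastChunk(long replicaLength) {
        if (replicaLength <= 0) {
            return 0;
        }
        // Start offset of the chunk that contains the final byte.
        return ((replicaLength - 1) / BYTES_PER_CHECKSUM) * BYTES_PER_CHECKSUM;
    }

    /** Models recovery in which the shortest replica decides the final length. */
    static long recoveredLength(List<Long> replicaLengths) {
        if (replicaLengths.isEmpty()) {
            throw new IllegalArgumentException("no replicas");
        }
        long min = Long.MAX_VALUE;
        for (long len : replicaLengths) {
            min = Math.min(min, len);
        }
        return min;
    }
}
```

For example, if three datanodes each hold 1300 sync()ed bytes and one of them finds its last chunk invalid, that replica shrinks to `truncateLastChunk(1300)`, which is 1024, and the modelled recovery then settles on 1024 bytes, discarding 276 bytes that the other two replicas still hold intact.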

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

