hadoop-hdfs-issues mailing list archives

From "Chang Li (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9289) check genStamp when complete file
Date Tue, 27 Oct 2015 20:58:28 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977156#comment-14977156

Chang Li commented on HDFS-9289:

[~zhz], yes, the above log is from the same cluster as the first log I posted.

The two replicas on the two datanodes in the updated pipeline had the new GS, but they were
marked as corrupt because the block was committed with the old genstamp.
The complete story of what happened in that cluster is: there were initially 3 datanodes in
the pipeline, d1, d2, d3. Then a pipeline update happened with only d2 and d3, using a new GS.
Then the file was completed with the old GS, and d2 and d3 were marked corrupt. Then, after 1
day, a full block report from d1 came in, and the NN found that d1 has the right block with the
"correct" old GS but that the block is under-replicated, so the NN told d1 to replicate its
replica with the old GS to two other nodes, d4 and d5. So the 3 DNs I showed above were d1,
d4, and d5, all having the old GS.
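
The sequence above can be replayed with a toy model. This is only a sketch and not HDFS code: `ToyNameNode`, `commit`, `isCorrupt`, and the genstamp values 1001 and 1002 are hypothetical names and numbers chosen for illustration. It assumes the NN simply records whatever genstamp arrives with the commit and treats any replica with a different genstamp as corrupt.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class GenStampScenario {
    /** Toy stand-in for the NameNode's view of one block (hypothetical, not HDFS code). */
    static final class ToyNameNode {
        long committedGenStamp = -1L;

        // Assumption: the commit records the client-supplied genstamp without checking it.
        void commit(long genStampFromClient) {
            committedGenStamp = genStampFromClient;
        }

        // A replica counts as corrupt when its genstamp differs from the committed one.
        boolean isCorrupt(long replicaGenStamp) {
            return replicaGenStamp != committedGenStamp;
        }
    }

    /** Replays the story: d1 keeps the old GS, d2 and d3 get the new GS, commit uses the old GS. */
    static Map<String, Boolean> replay() {
        long oldGs = 1001L;
        long newGs = 1002L;

        Map<String, Long> replicas = new LinkedHashMap<>();
        replicas.put("d1", oldGs); // dropped from the pipeline before the update
        replicas.put("d2", newGs); // in the updated pipeline
        replicas.put("d3", newGs); // in the updated pipeline

        ToyNameNode nn = new ToyNameNode();
        nn.commit(oldGs); // file completed with the stale genstamp

        Map<String, Boolean> corrupt = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : replicas.entrySet()) {
            corrupt.put(e.getKey(), nn.isCorrupt(e.getValue()));
        }
        return corrupt;
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints {d1=false, d2=true, d3=true}
    }
}
```

In this model the two up-to-date replicas are the ones flagged, while the stale replica on d1 looks valid, which matches the outcome described above.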
I think there probably exists some cache coherence issue, since
{code}protected ExtendedBlock block;{code}
lacks volatile. That could also explain why this issue didn't happen frequently.
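
To illustrate the visibility hypothesis, here is a minimal sketch, not the actual client code. `VolatileBlockRef`, `ToyBlock`, `updatePipeline`, and `currentGenStamp` are hypothetical names. It assumes the block is an immutable object that is replaced wholesale on a pipeline update; if the object were instead mutated in place, marking the reference volatile would not by itself make those field changes visible.

```java
public class VolatileBlockRef {
    /** Immutable toy block (hypothetical stand-in, not ExtendedBlock). */
    static final class ToyBlock {
        final long id;
        final long genStamp;

        ToyBlock(long id, long genStamp) {
            this.id = id;
            this.genStamp = genStamp;
        }
    }

    // volatile: a write by one thread is guaranteed to become visible to later reads
    // of this field by other threads.
    private volatile ToyBlock block = new ToyBlock(1L, 1001L);

    /** Called by the (single) writer thread: swap in a block with the new genstamp. */
    void updatePipeline(long newGenStamp) {
        block = new ToyBlock(block.id, newGenStamp);
    }

    /** Called by a different thread, e.g. the one that completes the file. */
    long currentGenStamp() {
        return block.genStamp;
    }

    /** One thread updates the genstamp while the caller spins until it observes the update. */
    static long demo() {
        VolatileBlockRef stream = new VolatileBlockRef();
        Thread updater = new Thread(() -> stream.updatePipeline(1002L));
        updater.start();
        // With a volatile field this loop is guaranteed to see the write eventually.
        while (stream.currentGenStamp() != 1002L) {
            Thread.yield();
        }
        try {
            updater.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return stream.currentGenStamp();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 1002
    }
}
```

Without `volatile` the Java memory model gives no such guarantee for the spinning reader, although in practice a stale read may be rare, which would be consistent with the issue being infrequent.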

> check genStamp when complete file
> ---------------------------------
>                 Key: HDFS-9289
>                 URL: https://issues.apache.org/jira/browse/HDFS-9289
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Chang Li
>            Assignee: Chang Li
>            Priority: Critical
>         Attachments: HDFS-9289.1.patch, HDFS-9289.2.patch, HDFS-9289.3.patch
> We have seen a case of a corrupt block caused by a file being completed after a pipelineUpdate,
> but with the old block genStamp. This caused the replicas on the two datanodes
> in the updated pipeline to be viewed as corrupt. Propose to check the genstamp when committing the block.
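
A minimal sketch of what such a check could look like, under stated assumptions. This is not the attached patch and not the real NameNode code: `CommitCheck`, `ToyBlock`, and `commitBlock` are hypothetical names, and the sketch assumes the NN already holds the genstamp set by the latest pipeline update.

```java
public class CommitCheck {
    /** Toy record of the NN's stored state for one block (hypothetical). */
    static final class ToyBlock {
        final long id;
        final long genStamp; // assumed to reflect the latest pipeline update
        boolean committed;

        ToyBlock(long id, long genStamp) {
            this.id = id;
            this.genStamp = genStamp;
        }
    }

    /** Commits the block only if the client reports the genstamp the NN has on record. */
    static void commitBlock(ToyBlock stored, long clientGenStamp) {
        if (clientGenStamp != stored.genStamp) {
            throw new IllegalStateException(
                "genstamp mismatch for block " + stored.id
                    + ": stored=" + stored.genStamp + ", client=" + clientGenStamp);
        }
        stored.committed = true;
    }

    public static void main(String[] args) {
        ToyBlock blk = new ToyBlock(1L, 1002L);
        try {
            commitBlock(blk, 1001L); // stale genstamp from before the pipeline update
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        commitBlock(blk, 1002L); // matching genstamp is accepted
        System.out.println("committed=" + blk.committed);
    }
}
```

Rejecting the commit surfaces the stale genstamp to the client at completion time, rather than letting the up-to-date replicas be marked corrupt later.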

This message was sent by Atlassian JIRA
