hadoop-hdfs-issues mailing list archives

From "Uma Maheswara Rao G (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3162) BlockMap's corruptNodes count and CorruptReplicas map count is not matching.
Date Sat, 31 Mar 2012 20:44:24 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13243286#comment-13243286 ]

Uma Maheswara Rao G commented on HDFS-3162:

I don't think this problem is because of append usage.

Looks like this is a race between markBlockAsCorrupt and processOverReplicatedBlocks.

1) NN detects an over-replicated block and adds it to the invalidates list for DNn.
2) Before the invalidates list is processed, the BlockScanner finds that the block is corrupt on DNn and reports it to NN.
3) Before the lock is acquired, the invalidates list gets processed and the block is removed from the blocksMap for DNn.
4) Now markBlockAsCorrupt starts processing.

    // Add this replica to corruptReplicas Map
    corruptReplicas.addToCorruptReplicasMap(storedBlockInfo, node);
    if (countNodes(storedBlockInfo).liveReplicas() > inode.getReplication()) {
      // the block is over-replicated so invalidate the replicas immediately
      invalidateBlock(storedBlockInfo, node);
    } else {
      // add the block to neededReplication
      updateNeededReplications(storedBlockInfo, -1, 0);
    }

Since it found enough replication, it goes into invalidateBlock, which tries to remove the storedBlock if live replicas are more than one.
This call just returns, because the block was already removed from the blocksMap.

But it was already added to the corruptReplicas map (shown in the piece of code above).

So, now the corruptReplicas map and the blocksMap disagree about the number of corrupt replicas.
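The mismatch can be sketched with plain Java maps standing in for the NN's blocksMap and CorruptReplicasMap (these are simplified stand-ins, not the real HDFS classes; the block and node names are made up for illustration):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the branch-1 race: the replica is gone from the blocksMap
// before markBlockAsCorrupt runs, but markBlockAsCorrupt still adds it
// to the corruptReplicas map unconditionally.
public class CorruptCountRace {
    static Map<String, Set<String>> blocksMap = new HashMap<>();
    static Map<String, Set<String>> corruptReplicas = new HashMap<>();

    // Branch-1-style markBlockAsCorrupt: adds to corruptReplicas first,
    // without checking whether the replica is still in the blocksMap.
    static void markBlockAsCorrupt(String block, String node) {
        corruptReplicas.computeIfAbsent(block, b -> new HashSet<>()).add(node);
        // invalidateBlock -> removeStoredBlock effectively just returns here,
        // because invalidates-list processing already removed the replica.
        blocksMap.getOrDefault(block, new HashSet<>()).remove(node);
    }

    public static void main(String[] args) {
        String blk = "blk_1";
        // Steps 1-3: the replica on DNn was already invalidated and removed.
        blocksMap.put(blk, new HashSet<>());
        // Step 4: the stale corruption report is processed.
        markBlockAsCorrupt(blk, "DNn");
        System.out.println("blocksMap has " + blocksMap.get(blk).size()
            + " but corrupt replicas map has " + corruptReplicas.get(blk).size());
        // Prints the same kind of inconsistency the NN logs repeatedly.
    }
}
```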

Mostly this issue exists only on branch-1.

I think this problem is already addressed in trunk.

Code from trunk:

    // Add replica to the data-node if it is not already there

    // Add this replica to corruptReplicas Map
    corruptReplicas.addToCorruptReplicasMap(storedBlock, node, reason);
    if (countNodes(storedBlock).liveReplicas() >= inode.getReplication()) {
      // the block is over-replicated so invalidate the replicas immediately
      invalidateBlock(storedBlock, node);
    }

See the first line above: if the block is not already there, it is added back first. I think this should have solved the problem in trunk.
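The effect of that extra step can be sketched the same way (again with simplified maps rather than the real BlocksMap/CorruptReplicasMap classes, and hypothetical block/node names):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch of the trunk behavior: the replica is re-added to the blocksMap
// before being marked corrupt, so both maps agree even when the replica
// had already been invalidated.
public class CorruptCountTrunk {
    static Map<String, Set<String>> blocksMap = new HashMap<>();
    static Map<String, Set<String>> corruptReplicas = new HashMap<>();

    static void markBlockAsCorrupt(String block, String node) {
        // "Add replica to the data-node if it is not already there"
        blocksMap.computeIfAbsent(block, b -> new HashSet<>()).add(node);
        // Add this replica to corruptReplicas Map
        corruptReplicas.computeIfAbsent(block, b -> new HashSet<>()).add(node);
    }

    public static void main(String[] args) {
        String blk = "blk_1";
        // The replica was already invalidated and removed from the blocksMap.
        blocksMap.put(blk, new HashSet<>());
        markBlockAsCorrupt(blk, "DNn");
        System.out.println("blocksMap has " + blocksMap.get(blk).size()
            + ", corrupt replicas map has " + corruptReplicas.get(blk).size());
        // Both maps now agree, so the later invalidation can remove the
        // replica from both sides and the counts stay matched.
    }
}
```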

> BlockMap's corruptNodes count and CorruptReplicas map count is not matching.
> ----------------------------------------------------------------------------
>                 Key: HDFS-3162
>                 URL: https://issues.apache.org/jira/browse/HDFS-3162
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 1.0.0
>            Reporter: suja s
>            Assignee: Uma Maheswara Rao G
>            Priority: Minor
>             Fix For: 1.0.3
> Even after invalidating the block, the below log comes continuously:
> Inconsistent number of corrupt replicas for blk_1332906029734_1719 blockMap has 0 but
> corrupt replicas map has 1

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

