hadoop-hdfs-issues mailing list archives

From "amith (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3122) Block recovery with closeFile flag true can race with blockReport. Due to this blocks are getting marked as corrupt.
Date Wed, 04 Apr 2012 13:47:23 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13246256#comment-13246256 ]

amith commented on HDFS-3122:
-----------------------------

Hi Folks,

I have thought of a possible solution for this problem; please give your input on it.

Whenever a block recovery happens and commitBlockSynchronization is called, we can store the old and
new generation stamps for the block in a map such as
historyMap = new HashMap<String, ArrayList<Long>>(). Here each ArrayList has size 2
and contains the old and new GenerationStamp.

On every recovery, this map is updated with the block and its generation stamps.

Consider the scenario where a BlockReport is delayed on its way to the NN.
If a BlockRecovery completes in the meantime, historyMap will hold an entry with the old and new
generation stamps for that block.
Then the BlockReport processing starts. Here:

{code}
case RWR: 
      if (!storedBlock.isComplete()) { 
        return null; // not corrupt 
      } else if (storedBlock.getGenerationStamp() != iblk.getGenerationStamp() &&
            historyMap.get(iblk.getBlockId()).get(0) != iblk.getGenerationStamp()) {
        return new BlockToMarkCorrupt(storedBlock, 
            "reported " + reportedState + " replica with genstamp " + 
            iblk.getGenerationStamp() + " does not match COMPLETE block's " + 
            "genstamp in block map " + storedBlock.getGenerationStamp()); 
      } else { // COMPLETE block, same genstamp
{code}

The check here is:
if (the block's GenerationStamp in the BlockMap != the GenerationStamp in the BlockReport, and
the old GenerationStamp recorded for a recent recovery of this block != the GenerationStamp in
the BlockReport) then { // mark the block as corrupt }

The map is populated in commitBlockSynchronization, and the entry is cleared when a BlockReport
carrying the new generation stamp is processed for this block.
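
To make the proposal concrete, here is a minimal standalone sketch of the bookkeeping described above. It is not HDFS code: the class name GenStampHistory and its methods recordRecovery and shouldMarkCorrupt are hypothetical, the map is keyed by the numeric block ID, and a two-element long[] stands in for the ArrayList of old and new generation stamps. It also assumes that a block with no history entry keeps the existing behaviour (a genstamp mismatch means corrupt).

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper illustrating the proposed history map; not part of HDFS. */
class GenStampHistory {
    // blockId -> {old generation stamp, new generation stamp} of the last recovery
    private final Map<Long, long[]> history = new HashMap<>();

    /** Would be called from commitBlockSynchronization after a recovery. */
    synchronized void recordRecovery(long blockId, long oldGenStamp, long newGenStamp) {
        history.put(blockId, new long[] {oldGenStamp, newGenStamp});
    }

    /**
     * Would be called while processing a reported replica of a COMPLETE block.
     * Returns true if the replica should be marked corrupt.
     */
    synchronized boolean shouldMarkCorrupt(long blockId, long storedGenStamp,
                                           long reportedGenStamp) {
        if (storedGenStamp == reportedGenStamp) {
            // A report with the current genstamp has been processed: clear the entry.
            history.remove(blockId);
            return false;
        }
        long[] stamps = history.get(blockId);
        // A delayed report is tolerated only if it carries the genstamp the block
        // had before the recorded recovery and the stored genstamp is the new one.
        boolean delayedReport = stamps != null
                && stamps[0] == reportedGenStamp
                && stamps[1] == storedGenStamp;
        return !delayedReport;
    }
}
```

In this sketch a delayed report carrying the pre-recovery genstamp is not treated as corrupt as long as the entry is still present, while any other mismatch is. Because the entry is cleared on the first matching report, a delayed report from another DN that arrives after that point would again be marked corrupt; the sketch does not try to handle that case.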

 


                
> Block recovery with closeFile flag true can race with blockReport. Due to this blocks are getting marked as corrupt.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3122
>                 URL: https://issues.apache.org/jira/browse/HDFS-3122
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, name-node
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>            Priority: Critical
>         Attachments: blockCorrupt.txt
>
>
> *Block Report* can *race* with *Block Recovery* when the closeFile flag is true.
> A block report is generated at the DN just before block recovery and, due to network problems, is delayed on its way to the NN.
> Meanwhile, recovery succeeds and the generation stamp changes to a new one.
> The primary DN then invokes commitBlockSynchronization, and the block is updated on the NN side with the new genstamp. The block is also marked as complete, since the closeFile flag was true.
> Now the blockReport starts processing on the NN side. This particular block was in RBW state when the DN generated the BR, but the file has since been completed on the NN side.
> Finally the block is marked as corrupt because of the genstamp mismatch.
> {code}
>  case RWR:
>       if (!storedBlock.isComplete()) {
>         return null; // not corrupt
>       } else if (storedBlock.getGenerationStamp() != iblk.getGenerationStamp()) {
>         return new BlockToMarkCorrupt(storedBlock,
>             "reported " + reportedState + " replica with genstamp " +
>             iblk.getGenerationStamp() + " does not match COMPLETE block's " +
>             "genstamp in block map " + storedBlock.getGenerationStamp());
>       } else { // COMPLETE block, same genstamp
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
