hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HDFS-1231) Generation Stamp mismatches, leading to failed append
Date Wed, 23 Jun 2010 02:19:51 GMT

     [ https://issues.apache.org/jira/browse/HDFS-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Konstantin Shvachko updated HDFS-1231:
--------------------------------------

    Affects Version/s: 0.20-append
                           (was: 0.20.1)

> Generation Stamp mismatches, leading to failed append
> -----------------------------------------------------
>
>                 Key: HDFS-1231
>                 URL: https://issues.apache.org/jira/browse/HDFS-1231
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.20-append
>            Reporter: Thanh Do
>
> - Summary: recoverBlock is not atomic, so when a failure occurs the client's
> retry fails as well.
>  
> - Setup:
> + # available datanodes = 3
> + # disks / datanode = 1
> + # failures = 2
> + failure type = crash
> + When/where failure happens = (see below)
>  
> - Details:
> Suppose there are 3 datanodes in the pipeline: dn3, dn2, and dn1, with dn1 as primary.
> When appending, the client first calls dn1.recoverBlock() to make all the datanodes in
> the pipeline agree on the new Generation Stamp (GS1) and the length of the block.
> The client then sends a data packet to dn3. dn3 in turn forwards this packet to the
> downstream datanodes (dn2 and dn1) and starts writing to its own disk, then crashes
> AFTER writing to the block file but BEFORE writing to the meta file. The client notices
> the crash and calls dn1.recoverBlock().
> dn1.recoverBlock() first creates a syncList (by calling getMetadataInfo on dn2 and dn1).
> Then dn1 calls NameNode.getNextGS() to get a new Generation Stamp (GS2).
> Then it calls dn2.updateBlock(), which returns successfully.
> Next, dn1 starts its own updateBlock() and crashes after renaming
> blk_X_GS1.meta to blk_X_GS1.meta_tmpGS2.
> From the client's point of view, dn1.recoverBlock() has therefore failed,
> but the NameNode has already handed out the new generation stamp (GS2).
> The client retries by calling dn2.recoverBlock() with the old GS (GS1), which does not
> match the new GS at the NameNode (GS2) --> an exception is thrown, and the append fails.
>  
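> As a reading aid, the sequence above can be condensed into a minimal Java sketch
> (interface and method names follow the report's wording, e.g. getNextGS and
> updateBlock; this is not the actual 0.20-append source):
> 
>     // Hypothetical sketch of the non-atomic recovery sequence.
>     public class RecoverBlockSketch {
>         interface DataNode {
>             void updateBlock(long blockId, long newGS) throws Exception;
>         }
>         interface NameNode {
>             long getNextGS();
>             void commitBlockSynchronization(long blockId, long newGS);
>         }
> 
>         static void recoverBlock(long blkX, DataNode dn1, DataNode dn2,
>                                  NameNode nn) throws Exception {
>             // 1. Build the syncList from the surviving replicas (elided).
>             // 2. The NameNode's generation stamp advances to GS2 here; this
>             //    side effect persists even if every later step fails.
>             long gs2 = nn.getNextGS();
>             // 3. dn2 renames its meta file to blk_X_GS2.meta and returns.
>             dn2.updateBlock(blkX, gs2);
>             // 4. Crash window: dn1 (the primary) dies while renaming
>             //    blk_X_GS1.meta to blk_X_GS1.meta_tmpGS2, so neither call
>             //    below completes.
>             dn1.updateBlock(blkX, gs2);
>             nn.commitBlockSynchronization(blkX, gs2); // never reached
>         }
>     }
> 
> The client's retry then presents GS1 while the NameNode has already handed out
> GS2, which is exactly the mismatch behind the failed append.
> 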
> Now, after all of this, we have:
> - on dn3 (which crashed):
> tmp/blk_X
> tmp/blk_X_GS1.meta
> - on dn2:
> current/blk_X
> current/blk_X_GS2.meta
> - on dn1:
> current/blk_X
> current/blk_X_GS1.meta_tmpGS2
> - at the NameNode, block X still has generation stamp GS1 (because dn1 never
> reached commitBlockSynchronization).
>  
> Therefore, when the crashed datanodes restart, the block on dn1 is invalid because
> there is no meta file. On dn3, the block file and meta file are finalized, but the
> block is corrupt because of a CRC mismatch. On dn2, the GS of the block is GS2,
> which does not match the generation stamp (GS1) recorded for the block at the NameNode.
> Hence, block blk_X is inaccessible.
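> For illustration only, those three rejection reasons condense into one
> hypothetical check (the parameters are invented for the sketch; this is not
> DataNode code):
> 
>     public class ReplicaCheckSketch {
>         static boolean usable(boolean hasMetaFile, boolean crcOk,
>                               long metaGS, long nameNodeGS) {
>             if (!hasMetaFile) return false; // dn1: interrupted rename, no meta file
>             if (!crcOk)       return false; // dn3: CRC mismatch after partial write
>             return metaGS == nameNodeGS;    // dn2: GS2 on disk vs GS1 at NameNode
>         }
> 
>         public static void main(String[] args) {
>             long GS1 = 1, GS2 = 2;
>             System.out.println(usable(false, true,  GS1, GS1)); // dn1 -> false
>             System.out.println(usable(true,  false, GS1, GS1)); // dn3 -> false
>             System.out.println(usable(true,  true,  GS2, GS1)); // dn2 -> false
>         }
>     }
> 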
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and 
> Haryadi Gunawi (haryadi@eecs.berkeley.edu)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

