hadoop-hdfs-dev mailing list archives

From "Thanh Do (JIRA)" <j...@apache.org>
Subject [jira] Created: (HDFS-1231) Generation Stamp mismatches, leading to failed append
Date Thu, 17 Jun 2010 05:25:23 GMT
Generation Stamp mismatches, leading to failed append
-----------------------------------------------------

                 Key: HDFS-1231
                 URL: https://issues.apache.org/jira/browse/HDFS-1231
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs client
    Affects Versions: 0.20.1
            Reporter: Thanh Do


- Summary: recoverBlock is not atomic, so a retry fails when a failure occurs
partway through recovery.
 
- Setup:
+ # available datanodes = 3
+ # disks / datanode = 1
+ # failures = 2
+ failure type = crash
+ When/where failure happens = (see below)
 
- Details:
Suppose there are 3 datanodes in the pipeline: dn3, dn2, and dn1. dn1 is the primary.
When appending, the client first calls dn1.recoverBlock() to make all the datanodes in
the pipeline agree on the new Generation Stamp (GS1) and the length of the block.
The client then sends a data packet to dn3. dn3 forwards this packet to the downstream
datanodes (dn2 and dn1) and starts writing to its own disk; it crashes AFTER writing to
the block file but BEFORE writing to the meta file. The client notices the crash and
calls dn1.recoverBlock().
dn1.recoverBlock() first creates a syncList (by calling getMetadataInfo at dn2 and dn1).
Then dn1 calls NameNode.getNextGS() to get a new Generation Stamp (GS2).
Then it calls dn2.updateBlock(), which returns successfully.
Next, dn1 calls its own updateBlock() and crashes after renaming
blk_X_GS1.meta to blk_X_GS1.meta_tmpGS2.
Therefore, from the client's point of view, dn1.recoverBlock() fails,
but the generation stamp at the NameNode has already been incremented (to GS2).
The client retries by calling dn2.recoverBlock() with the old GS (GS1), which does not
match the new GS (GS2) that the NameNode issued and dn2 already applied --> exception,
so the append fails.
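
The sequence above can be replayed with a small toy model. The Python sketch below
is not HDFS code: every name in it (ToyNameNode, ToyDatanode, recover_block,
run_scenario) is hypothetical, each replica is reduced to one meta file name, and
where the real code detects the mismatch is not modeled. It only illustrates how a
crash between the two rename steps leaves the replicas disagreeing on the
generation stamp, so that a retry with GS1 is rejected.

```python
class Crash(Exception):
    """Simulated datanode crash (hypothetical, for illustration only)."""


class GenerationStampMismatch(Exception):
    """The caller's generation stamp differs from the replica's."""


class ToyNameNode:
    def __init__(self, gs):
        self.last_issued_gs = gs  # counter used to hand out new stamps
        self.block_gs = gs        # stamp recorded for the block itself

    def get_next_gs(self):
        self.last_issued_gs += 1
        return self.last_issued_gs

    def commit_synchronization(self, new_gs):
        self.block_gs = new_gs


class ToyDatanode:
    def __init__(self, name, gs, crash_mid_rename=False):
        self.name = name
        self.gs = gs
        self.meta = f"blk_X_GS{gs}.meta"
        self.crash_mid_rename = crash_mid_rename

    def update_block(self, old_gs, new_gs):
        if self.gs != old_gs:
            raise GenerationStampMismatch(self.name)
        # Step 1: rename the meta file to a temporary name.
        self.meta = f"blk_X_GS{old_gs}.meta_tmpGS{new_gs}"
        if self.crash_mid_rename:
            raise Crash(self.name)  # the crash window described in the report
        # Step 2: rename the temporary file to its final name.
        self.meta = f"blk_X_GS{new_gs}.meta"
        self.gs = new_gs


def recover_block(primary, others, namenode, client_gs):
    # Assumption: the primary rejects a caller whose stamp is out of date.
    if primary.gs != client_gs:
        raise GenerationStampMismatch(primary.name)
    new_gs = namenode.get_next_gs()
    for dn in others:
        dn.update_block(client_gs, new_gs)
    primary.update_block(client_gs, new_gs)   # may crash half-way
    namenode.commit_synchronization(new_gs)   # never reached after a crash
    return new_gs


def run_scenario():
    namenode = ToyNameNode(gs=1)
    dn1 = ToyDatanode("dn1", gs=1, crash_mid_rename=True)
    dn2 = ToyDatanode("dn2", gs=1)
    outcome = {}
    try:
        recover_block(dn1, [dn2], namenode, client_gs=1)
        outcome["first_attempt"] = "ok"
    except Crash:
        outcome["first_attempt"] = "crash"
    try:
        # The client retries through dn2, still holding the old stamp GS1.
        recover_block(dn2, [], namenode, client_gs=1)
        outcome["retry"] = "ok"
    except GenerationStampMismatch:
        outcome["retry"] = "mismatch"
    outcome.update(dn1_meta=dn1.meta, dn2_meta=dn2.meta,
                   namenode_block_gs=namenode.block_gs)
    return outcome
```

Calling run_scenario() returns a dictionary with the outcome of both attempts and
the meta file name left on each toy datanode. Because the steps inside
recover_block are not applied as one unit, the first attempt leaves dn2 updated,
dn1 half-renamed, and the block's recorded stamp unchanged.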
 
After all of this, we have:
- in dn3 (which has crashed)
tmp/blk_X
tmp/blk_X_GS1.meta
- in dn2
current/blk_X
current/blk_X_GS2.meta
- in dn1:
current/blk_X
current/blk_X_GS1.meta_tmpGS2
- in NameNode, the block X still has generation stamp GS1 (because dn1 has not called
commitBlockSynchronization yet).
 
Therefore, when the crashed datanodes restart, the block at dn1 is invalid because
there is no meta file. At dn3, the block file and meta file are finalized, but the
block is corrupt because of a CRC mismatch. At dn2, the GS of the block is GS2,
which does not equal the generation stamp that the NameNode maintains for the block.
Hence, the block blk_X is inaccessible.
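
The per-replica outcome above can be expressed as a small check over each
datanode's file listing. This is a hedged sketch rather than the datanode's actual
startup logic: replica_status is a hypothetical helper, the CRC result is passed
in as a flag instead of being computed, and dn2's meta file is assumed to be
named blk_X_GS2.meta.

```python
import re

# Matches only a finished meta file such as "blk_X_GS1.meta";
# a half-renamed "blk_X_GS1.meta_tmpGS2" does not match.
META_RE = re.compile(r"blk_X_GS(\d+)\.meta")


def replica_status(files, namenode_gs, crc_ok=True):
    """Classify one replica of blk_X from its file names (toy model)."""
    if "blk_X" not in files:
        return "missing block file"
    stamps = []
    for name in files:
        match = META_RE.fullmatch(name)
        if match:
            stamps.append(int(match.group(1)))
    if not stamps:
        return "invalid: no meta file"
    if not crc_ok:
        return "corrupt: CRC mismatch"
    if stamps[0] != namenode_gs:
        return "unusable: GS mismatch"
    return "ok"


# State after the failed recovery; the NameNode records GS1 for the block.
dn1 = replica_status(["blk_X", "blk_X_GS1.meta_tmpGS2"], namenode_gs=1)
dn2 = replica_status(["blk_X", "blk_X_GS2.meta"], namenode_gs=1)
dn3 = replica_status(["blk_X", "blk_X_GS1.meta"], namenode_gs=1, crc_ok=False)
```

Under these assumptions none of the three replicas is classified as "ok", which
matches the conclusion that blk_X is inaccessible.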

This bug was found by our Failure Testing Service framework:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and 
Haryadi Gunawi (haryadi@eecs.berkeley.edu)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

