hadoop-hdfs-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Thanh Do (JIRA)" <j...@apache.org>
Subject [jira] Created: (HDFS-1229) DFSClient incorrectly asks for new block if primary crashes during first recoverBlock
Date Thu, 17 Jun 2010 04:28:24 GMT
DFSClient incorrectly asks for new block if primary crashes during first recoverBlock
-------------------------------------------------------------------------------------

                 Key: HDFS-1229
                 URL: https://issues.apache.org/jira/browse/HDFS-1229
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: hdfs client
    Affects Versions: 0.20.1
            Reporter: Thanh Do


- Setup:
+ # available datanodes = 2
+ # disks / datanode = 1
+ # failures = 1
+ failure type = crash
+ When/where failure happens = during primary's recoverBlock
 
- Details:
Say client is appending to block X1 in 2 datanodes: dn1 and dn2.
First it needs to make sure both dn1 and dn2  agree on the new GS of the block.
1) Client first creates DFSOutputStream by calling
 
>OutputStream result = new DFSOutputStream(src, buffersize, progress,
>                                            lastBlock, stat, conf.getInt("io.bytes.per.checksum",
512));
 
in DFSClient.append()
 
2) The above DFSOutputStream constructor in turn calls processDataNodeError(true, true)
(i.e, hasError = true, isAppend = true), and starts the DataStreammer
 
> processDatanodeError(true, true);  /* let's call this PDNE 1 */
> streamer.start();
 
Note that DataStreammer.run() also calls processDatanodeError()
> while (!closed && clientRunning) {
>  ...
>      boolean doSleep = processDatanodeError(hasError, false); /let's call this PDNE 2*/
 
3) Now in the PDNE 1, we have following code:
 
> blockStream = null;
> blockReplyStream = null;
> ...
> while (!success && clientRunning) {
> ...
>    try {
>         primary = createClientDatanodeProtocolProxy(primaryNode, conf);
>         newBlock = primary.recoverBlock(block, isAppend, newnodes); /*exception here*/
>         ...
>    catch (IOException e) {
>         ...
>         if (recoveryErrorCount > maxRecoveryErrorCount) { 
>         /* this condition is false */
>         }
>         ...
>         return true;
>    } // end catch
>    finally {...}
>    
>    this.hasError = false;
>    lastException = null;
>    errorIndex = 0;
>    success = createBlockOutputStream(nodes, clientName, true);
>    }
>    ...
 
Because dn1 crashes during client call to recoverBlock, we have an exception.
Hence, go to the catch block, in which processDatanodeError returns true
before setting hasError to false. Also, because createBlockOutputStream() is not called
(due to an early return), blockStream is still null.
 
4) Now PDNE 1 has finished, we come to streamer.start(), which calls PDNE 2.
Because hasError = false, PDNE 2 returns false immediately without doing anything
>    if (!hasError) {
>     return false;
>    }
 
5) still in the DataStreamer.run(), after returning false from PDNE 2, we still have
blockStream = null, hence the following code is executed:
>              if (blockStream == null) {
>               nodes = nextBlockOutputStream(src);
>               this.setName("DataStreamer for file " + src +
>                             " block " + block);
>                response = new ResponseProcessor(nodes);
>                response.start();
>              }
 
nextBlockOutputStream which asks namenode to allocate new Block is called.
(This is not good, because we are appending, not writing).
Namenode gives it new Block ID and a set of datanodes, including crashed dn1.
this leads to createOutputStream() fails because it tries to contact the dn1 first.
(which has crashed). The client retries 5 times without any success,
because every time, it asks namenode for new block! Again we see
that the retry logic at client is weird!

*This bug was found by our Failure Testing Service framework:
http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and 
Haryadi Gunawi (haryadi@eecs.berkeley.edu)*

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message