Date: Mon, 26 Jul 2010 21:14:17 -0400 (EDT)
From: "Todd Lipcon (JIRA)"
To: hdfs-dev@hadoop.apache.org
Subject: [jira] Resolved: (HDFS-1229) DFSClient incorrectly asks for new block if primary crashes during first recoverBlock

     [ https://issues.apache.org/jira/browse/HDFS-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved HDFS-1229.
-------------------------------
    Resolution: Cannot Reproduce

Thanks for checking against the append branch; marking as resolved since the bug is fixed there.

> DFSClient incorrectly asks for new block if primary crashes during first recoverBlock
> -------------------------------------------------------------------------------------
>
>                 Key: HDFS-1229
>                 URL: https://issues.apache.org/jira/browse/HDFS-1229
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.20-append
>            Reporter: Thanh Do
>
> Setup:
> --------
> + # available datanodes = 2
> + # disks / datanode = 1
> + # failures = 1
> + failure type = crash
> + when/where failure happens = during the primary's recoverBlock
>
> Details:
> ----------
> Say the client is appending to block X1 on two datanodes, dn1 and dn2.
> Before appending, it must first make sure both dn1 and dn2 agree on the
> new generation stamp (GS) of the block.
>
> 1) The client first creates a DFSOutputStream in DFSClient.append():
>
>   OutputStream result = new DFSOutputStream(src, buffersize, progress,
>       lastBlock, stat, conf.getInt("io.bytes.per.checksum", 512));
>
> 2) The DFSOutputStream constructor in turn calls
> processDatanodeError(true, true) (i.e., hasError = true, isAppend = true)
> and then starts the DataStreamer:
>
>   processDatanodeError(true, true); /* let's call this PDNE 1 */
>   streamer.start();
>
> Note that DataStreamer.run() also calls processDatanodeError():
>
>   while (!closed && clientRunning) {
>     ...
>     boolean doSleep = processDatanodeError(hasError, false); /* let's call this PDNE 2 */
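>
> To make this control flow concrete, here is a condensed, runnable model of
> the sequence so far (hypothetical names and a heavy simplification, not the
> actual 0.20-append code; the body of processDatanodeError anticipates the
> early return detailed in step 3 below, and dn1's crash is simulated by a
> thrown IOException):
>
>   import java.io.IOException;
>
>   public class AppendPathSketch {
>       volatile boolean hasError = false;  // the field PDNE 2 checks
>       Object blockStream = null;          // null until the pipeline is rebuilt
>
>       // Model of processDatanodeError: on failure it returns early,
>       // leaving hasError and blockStream untouched.
>       boolean processDatanodeError(boolean error, boolean isAppend) {
>           if (!error) return false;       // PDNE 2 takes this exit (step 4)
>           try {
>               recoverBlock();             // the primary may crash here
>               hasError = false;           // reached only on success
>               blockStream = new Object(); // stands in for createBlockOutputStream
>               return false;
>           } catch (IOException e) {
>               return true;                // early return (step 3)
>           }
>       }
>
>       void recoverBlock() throws IOException {
>           throw new IOException("primary crashed during recoverBlock");
>       }
>
>       public static void main(String[] args) {
>           AppendPathSketch s = new AppendPathSketch();
>           s.processDatanodeError(true, true);  // PDNE 1, from the constructor
>           // DataStreamer.run() then does:
>           boolean doSleep = s.processDatanodeError(s.hasError, false);  // PDNE 2
>           System.out.println("PDNE 2 returned " + doSleep
>               + "; blockStream == null: " + (s.blockStream == null));
>           // Prints "PDNE 2 returned false; blockStream == null: true",
>           // so run() falls through to nextBlockOutputStream() (step 5).
>       }
>   }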
>
> 3) Now, inside PDNE 1, we have the following code:
>
>   blockStream = null;
>   blockReplyStream = null;
>   ...
>   while (!success && clientRunning) {
>     ...
>     try {
>       primary = createClientDatanodeProtocolProxy(primaryNode, conf);
>       newBlock = primary.recoverBlock(block, isAppend, newnodes); /* exception here */
>       ...
>     } catch (IOException e) {
>       ...
>       if (recoveryErrorCount > maxRecoveryErrorCount) {
>         // this condition is false
>       }
>       ...
>       return true;
>     } // end catch
>     finally { ... }
>
>     this.hasError = false;
>     lastException = null;
>     errorIndex = 0;
>     success = createBlockOutputStream(nodes, clientName, true);
>   }
>   ...
>
> Because dn1 crashes during the client's call to recoverBlock, an exception
> is thrown and control jumps to the catch block, where processDatanodeError
> returns true before ever reaching the statement that sets hasError to false.
> (Note that the hasError field was never true in the first place: PDNE 1
> received true as an argument, not via the field.) Also, because the early
> return skips createBlockOutputStream(), blockStream is still null.
>
> 4) Now that PDNE 1 has finished, we come to streamer.start(), which leads to
> PDNE 2. Because the hasError field is false, PDNE 2 returns false
> immediately without doing anything:
>
>   if (!hasError) { return false; }
>
> 5) Still in DataStreamer.run(), after PDNE 2 returns false, blockStream is
> still null, so the following code is executed:
>
>   if (blockStream == null) {
>     nodes = nextBlockOutputStream(src);
>     this.setName("DataStreamer for file " + src + " block " + block);
>     response = new ResponseProcessor(nodes);
>     response.start();
>   }
>
> nextBlockOutputStream(), which asks the namenode to allocate a NEW block, is
> called. This is wrong, because we are appending, not writing a new file. The
> namenode hands back a new block ID and a set of target datanodes that still
> includes the crashed dn1, so createBlockOutputStream() fails when it tries
> to contact dn1 first. The client then retries 5 times without any success,
> because on every retry it asks the namenode for yet another new block (a
> condensed model of this retry loop appears at the end of this report).
> Again, the client-side retry logic is broken.
>
> *This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and
> Haryadi Gunawi (haryadi@eecs.berkeley.edu)*
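>
> For completeness, that retry behaviour reduces to the following
> self-contained model (again hypothetical names, not the real DFSClient;
> the namenode and the crashed dn1 are both simulated):
>
>   import java.io.IOException;
>   import java.util.Arrays;
>   import java.util.List;
>
>   public class AppendRetrySketch {
>       static final int MAX_RETRIES = 5;   // the five client retries above
>       static long nextBlockId = 1000;
>
>       // Stand-in for the namenode: every call allocates a brand-new block
>       // id and keeps returning the crashed datanode in the pipeline.
>       static List<String> allocateNewBlock() {
>           nextBlockId++;
>           return Arrays.asList("dn1 (crashed)", "dn2");
>       }
>
>       // The client contacts the first datanode in the pipeline; dn1 is down.
>       static void createBlockOutputStream(List<String> pipeline)
>               throws IOException {
>           throw new IOException("cannot connect to " + pipeline.get(0));
>       }
>
>       public static void main(String[] args) {
>           for (int retry = 1; retry <= MAX_RETRIES; retry++) {
>               List<String> pipeline = allocateNewBlock(); // new block each time!
>               try {
>                   createBlockOutputStream(pipeline);
>                   return;                                 // never reached
>               } catch (IOException e) {
>                   System.out.println("retry " + retry + ": block "
>                       + nextBlockId + " abandoned (" + e.getMessage() + ")");
>               }
>           }
>           System.out.println("append fails after " + MAX_RETRIES + " retries");
>       }
>   }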