hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HDFS-1233) Bad retry logic at DFSClient
Date Thu, 17 Jun 2010 17:35:27 GMT

     [ https://issues.apache.org/jira/browse/HDFS-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved HDFS-1233.

    Resolution: Won't Fix

This is a known deficiency; I don't think anyone has plans to fix it. Any cluster that has multiple
disks per DN likely has multiple DNs too.

> Bad retry logic at DFSClient
> ----------------------------
>                 Key: HDFS-1233
>                 URL: https://issues.apache.org/jira/browse/HDFS-1233
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.20.1
>            Reporter: Thanh Do
> - Summary: failover bug; bad retry logic at DFSClient; cannot fail over to the 2nd disk
> - Setups:
> + # available datanodes = 1
> + # disks / datanode = 2
> + # failures = 1
> + failure type = bad disk
> + When/where failure happens = (see below)
> - Details:
> The setup is:
> 1 datanode, 1 replica, and each datanode has 2 disks (Disk1 and Disk2).
> We injected a single disk failure to see whether the write can fail over
> to the second disk.
> If a persistent disk failure happens during createBlockOutputStream
> (the first phase of pipeline creation) (e.g., DN1-Disk1 is bad),
> then createBlockOutputStream (cbos) will get an exception and
> will retry. When it retries, it will get the same DN1 from the namenode,
> and then DN1 will call DN.writeBlock(), FSVolume.createTmpFile,
> and finally getNextVolume(), which advances a round-robin volume
> index. Thus, on the second try, the write will successfully go to the
> second disk.
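> As a rough illustration, the round-robin selection looks like the
> sketch below (the name getNextVolume follows FSDataset.FSVolumeSet in
> 0.20, but this is a simplified stand-in, not the actual Hadoop source;
> the real method also checks free space per volume):
>
>   import java.io.IOException;
>   import java.util.List;
>
>   class VolumeSet {
>       private final List<String> volumes; // e.g. ["Disk1", "Disk2"]
>       private int curVolume = 0;          // round-robin cursor
>
>       VolumeSet(List<String> volumes) {
>           this.volumes = volumes;
>       }
>
>       // Each call hands out the next volume in round-robin order, so a
>       // retry after a Disk1 failure naturally lands on Disk2.
>       synchronized String getNextVolume() throws IOException {
>           if (volumes.isEmpty()) {
>               throw new IOException("no volumes configured");
>           }
>           String v = volumes.get(curVolume);
>           curVolume = (curVolume + 1) % volumes.size();
>           return v;
>       }
>   }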
> So essentially createBlockOutputStream is wrapped in a
> do/while(retry && --count >= 0). The first cbos will fail and the second
> will succeed in this particular scenario.
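> The shape of that outer loop, heavily simplified (the helper methods
> here are stubs standing in for the real namenode/datanode calls, not
> actual DFSClient method signatures):
>
>   import java.io.IOException;
>
>   class BlockStreamOpener {
>       private int failuresLeft = 1; // simulate one cbos failure
>
>       void openBlockStream() throws IOException {
>           int count = 3;            // bounded block-level retries
>           boolean retry;
>           do {
>               retry = false;
>               // With a 1-DN cluster the namenode returns DN1 again.
>               String node = locateFollowingBlock();
>               if (!createBlockOutputStream(node)) {
>                   // The attempt hit the bad disk: abandon the block
>                   // and retry; the DN's round-robin then picks Disk2.
>                   abandonBlock();
>                   retry = true;
>               }
>           } while (retry && --count >= 0);
>       }
>
>       private String locateFollowingBlock() { return "DN1"; }
>
>       private boolean createBlockOutputStream(String node) {
>           return failuresLeft-- <= 0; // fails once, then succeeds
>       }
>
>       private void abandonBlock() { /* drop the half-created block */ }
>   }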
> NOW, say cbos is successful, but the failure is persistent.
> Then the "retry" happens in a different while loop.
> First, hasError is set to true in ResponseProcessor.run() (the thread
> that processes the pipeline's ack packets).
> Thus, DataStreamer.run() will go back to the loop:
> while(!closed && clientRunning && !lastPacketInBlock).
> This second iteration of the loop will call
> processDatanodeError because hasError has been set to true.
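> In sketch form, the interaction between the two threads (field and
> method names follow 0.20's DFSClient, but the bodies are reduced to
> stubs for illustration):
>
>   import java.io.IOException;
>
>   class StreamerSketch {
>       private volatile boolean hasError = false;
>       private volatile boolean closed = false;
>       private volatile boolean clientRunning = true;
>       private volatile boolean lastPacketInBlock = false;
>
>       // ResponseProcessor.run(): a failed ack sets hasError.
>       void onBadAck() {
>           hasError = true;
>       }
>
>       // DataStreamer.run(): the loop quoted above. On the iteration
>       // after the ack failure it routes into processDatanodeError().
>       void run() {
>           while (!closed && clientRunning && !lastPacketInBlock) {
>               try {
>                   if (hasError) {
>                       processDatanodeError(); // throws for a 1-DN pipeline
>                   }
>                   // ... otherwise dequeue and send the next packet ...
>               } catch (IOException e) {
>                   closed = true; // error surfaces to the client
>               }
>           }
>       }
>
>       // Reduced to a stub here; see the next sketch for the check.
>       private void processDatanodeError() throws IOException {
>           throw new IOException("All datanodes are bad. Aborting...");
>       }
>   }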
> In processDatanodeError (pde), the client sees that this is the only
> datanode in the pipeline, and hence it considers the whole node bad,
> although in fact only one disk is bad. pde therefore throws an
> IOException reporting that all the datanodes in the pipeline (in this
> case, only DN1) are bad, and the exception is thrown to the client.
> Had the exception instead been caught by the outermost
> do/while(retry && --count >= 0) loop, that outer retry would have
> succeeded (as described in the previous paragraph).
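> The decisive check, paraphrased (loosely modeled on 0.20's
> DFSClient.processDatanodeError; the exact code differs, but the
> single-datanode case is the point):
>
>   private void processDatanodeError(String[] pipeline) throws IOException {
>       if (pipeline.length <= 1) {
>           // Only DN1 is in the pipeline, so the whole node is declared
>           // bad even though only one of its disks failed. The exception
>           // goes to the client instead of re-entering the outer
>           // do/while(retry && --count >= 0), which could have retried
>           // the block and reached Disk2.
>           throw new IOException("All datanodes " + pipeline[0]
>               + " are bad. Aborting...");
>       }
>       // With more than one datanode: drop the bad node and rebuild the
>       // pipeline from the survivors.
>   }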
> In summary, if a deployment has only one datanode with multiple
> disks, and one disk goes bad, the current retry logic on the DFSClient
> side is not robust enough to mask the failure from the client.
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and 
> Haryadi Gunawi (haryadi@eecs.berkeley.edu)

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
