hadoop-hdfs-issues mailing list archives

From "dhruba borthakur (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1237) Client logic for 1st phase and 2nd phase failover are different
Date Fri, 10 Sep 2010 08:54:41 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12907935#action_12907935 ]

dhruba borthakur commented on HDFS-1237:

If two replicas of an existing block are already on two datanodes and a client wants to append
to that block but both datanodes are dead, then there is not much that we can do.

For the non-append case, when a client is creating and writing to a file (starting from the
beginning), it is possible to recover even if both datanodes in the pipeline die, provided the
client caches the block until the entire block is written. If this is the intent of this JIRA,
then it should be an "improvement" rather than a bug.
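
To make the caching idea above concrete, here is a minimal sketch, not DFSClient code. The names `BlockReplaySketch`, `Pipeline`, and `CachingWriter` are hypothetical and exist only for illustration. It assumes the client keeps every packet of the current block in memory and replays them to a fresh pipeline once the old one is lost.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class BlockReplaySketch {
    /** Hypothetical stand-in for a datanode pipeline; not an HDFS class. */
    static class Pipeline {
        final ByteArrayOutputStream stored = new ByteArrayOutputStream();
        boolean dead = false;

        void write(byte[] packet) {
            if (dead) {
                throw new IllegalStateException("all datanodes in the pipeline are dead");
            }
            stored.write(packet, 0, packet.length);
        }
    }

    /** Hypothetical client-side writer that caches the block until it is complete. */
    static class CachingWriter {
        private final List<byte[]> cachedPackets = new ArrayList<>();
        private Pipeline pipeline;

        CachingWriter(Pipeline initial) {
            this.pipeline = initial;
        }

        void write(byte[] packet) {
            // Cache first, so a packet whose send fails is still replayed later.
            cachedPackets.add(packet.clone());
            pipeline.write(packet);
        }

        /** Replays every cached packet of the block to a fresh pipeline. */
        void recover(Pipeline fresh) {
            for (byte[] packet : cachedPackets) {
                fresh.write(packet);
            }
            pipeline = fresh;
        }

        /** Drops the cache once the whole block has been written. */
        void finishBlock() {
            cachedPackets.clear();
        }
    }
}
```

The cost of this approach is client memory proportional to the block size, which is one reason it reads as an improvement rather than a bug fix. It also does not help the append case, since the data already on the dead datanodes was never held by the client.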

> Client logic for 1st phase and 2nd phase failover are different
> ---------------------------------------------------------------
>                 Key: HDFS-1237
>                 URL: https://issues.apache.org/jira/browse/HDFS-1237
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.20.1
>            Reporter: Thanh Do
> - Setup:
> number of datanodes = 4
> replication factor = 2 (2 datanodes in the pipeline)
> number of failures injected = 2
> failure type: crash
> Where/When failures happen: There are two scenarios. In the first, two datanodes crash
> at the same time in the first phase of the pipeline. In the second, two datanodes crash
> in the second phase of the pipeline.
> - Details:
> In this setting, we set the datanode's heartbeat interval to the namenode to 1 second.
> This is just to show that if the NN has declared a datanode dead, the DFSClient will
> get that dead datanode from the server. Here are our observations:
> 1. If the two crashes happen during the first phase,
> the client will wait for 6 seconds (which is enough time for the NN to detect
> dead datanodes in this setting). So after waiting for 6 seconds, the client
> asks the NN again, and the NN is able to give it two fresh, healthy datanodes,
> and the experiment is successful!
> 2. BUT, if the two crashes happen during the second phase (e.g. renameTo),
> the client *never waits for 6 secs*, which implies that the client logic
> for the 1st phase and the 2nd phase is different.  What happens here is that DFSClient gives
> up and (we believe) never falls back to the outer while loop to contact the
> NN again.  So the two crashes in this second phase are not masked properly,
> and the write operation fails.
> In summary, scenario (1) is good, but scenario (2) is not successful. This shows
> bad retry logic during the second phase.  (We note again that we changed
> the setup a bit by setting the DN's heartbeat interval to 1 second.  If we use
> the default interval, scenario (1) will fail too because the NN will give the
> client the same dead datanodes).
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and
> Haryadi Gunawi (haryadi@eecs.berkeley.edu)
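
The first-phase behavior in the quoted description (wait, then ask the NN again) can be sketched as the loop below. This is an illustration under stated assumptions, not the DFSClient implementation: `PipelineRetrySketch`, `setupPipeline`, `askNameNode`, and `connect` are hypothetical names, and the namenode and datanodes are modeled as plain callbacks. Only the 6 second wait comes from the report. The report's complaint is that the second phase never falls back to a loop of this kind.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

public class PipelineRetrySketch {
    /**
     * Asks the (hypothetical) namenode callback for targets until a pipeline
     * can be set up. Returns the usable targets, or null if every attempt
     * fails or the thread is interrupted while waiting.
     */
    static List<String> setupPipeline(Supplier<List<String>> askNameNode,
                                      Predicate<List<String>> connect,
                                      int maxAttempts,
                                      long waitMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            List<String> targets = askNameNode.get();
            if (connect.test(targets)) {
                return targets;
            }
            if (attempt < maxAttempts) {
                try {
                    // Give the namenode time to notice the dead datanodes
                    // (6000 ms in the report's setting).
                    Thread.sleep(waitMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
        }
        return null;
    }
}
```

The wait only helps if it is longer than the time the namenode needs to declare a datanode dead, which is why the report notes that scenario (1) also fails with the default heartbeat interval.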

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
