hadoop-hdfs-dev mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Resolved: (HDFS-1237) Client logic for 1st phase and 2nd phase failover are different
Date Thu, 17 Jun 2010 17:45:25 GMT

     [ https://issues.apache.org/jira/browse/HDFS-1237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon resolved HDFS-1237.
-------------------------------

    Resolution: Invalid

If both DNs crash in a pipeline of 2 DNs, of course the pipeline does not recover. The likelihood
of correlated failure of all nodes in a pipeline is very small, since one of the replicas is
off-rack. Please reopen if you think there's _any_ action the client could take to recover
when the entire pipeline has crashed.
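Todd's point can be sketched minimally: a write pipeline is only recoverable from datanodes that are still alive, so when every node in a 2-node pipeline crashes there is nothing left to rebuild from. The class and method names below are hypothetical, not DFSClient's actual API.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch, not real DFSClient code: with replication factor 2
// the pipeline holds two datanodes, and recovery needs at least one
// survivor holding the packets already written.
public class PipelineRecoverySketch {

    /** Nodes in the pipeline that have not crashed. */
    static List<String> survivors(List<String> pipeline, List<String> crashed) {
        List<String> alive = new ArrayList<>(pipeline);
        alive.removeAll(crashed);
        return alive;
    }

    /** Recovery is only possible while some replica is still reachable. */
    static boolean canRecover(List<String> pipeline, List<String> crashed) {
        return !survivors(pipeline, crashed).isEmpty();
    }

    public static void main(String[] args) {
        List<String> pipeline = List.of("dn1", "dn2");
        System.out.println(canRecover(pipeline, List.of("dn1")));        // true
        System.out.println(canRecover(pipeline, List.of("dn1", "dn2"))); // false
    }
}
```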

> Client logic for 1st phase and 2nd phase failover are different
> ---------------------------------------------------------------
>
>                 Key: HDFS-1237
>                 URL: https://issues.apache.org/jira/browse/HDFS-1237
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs client
>    Affects Versions: 0.20.1
>            Reporter: Thanh Do
>
> - Setup:
> number of datanodes = 4
> replication factor = 2 (2 datanodes in the pipeline)
> number of failure injected = 2
> failure type: crash
> Where/When failures happen: There are two scenarios. First, two datanodes
> crash at the same time during the first phase of the pipeline. Second, two
> datanodes crash during the second phase of the pipeline.
>  
> - Details:
>  
> In this setting, we set the datanode's heartbeat interval to the namenode
> to 1 second. This is just to show that once the NN has declared a datanode
> dead, the DFSClient will not get that dead datanode back from the server.
> Here are our observations:
>  
> 1. If the two crashes happen during the first phase,
> the client will wait for 6 seconds (which is enough time for the NN to
> detect dead datanodes in this setting). After waiting 6 seconds, the client
> asks the NN again, the NN is able to give two fresh, healthy datanodes,
> and the experiment is successful!
>  
> 2. BUT, if the two crashes happen during the second phase (e.g. renameTo),
> the client *never waits for 6 secs*, which implies that the client logic
> for the 1st phase and the 2nd phase is different. What happens here is that
> DFSClient gives up and (we believe) never falls back to the outer while
> loop to contact the NN again. So the two crashes in this second phase are
> not masked properly, and the write operation fails.
>  
> In summary, scenario (1) is good, but scenario (2) is not successful. This
> shows bad retry logic during the second phase. (We note again that we
> changed the setup a bit by setting the DN's heartbeat interval to 1 second.
> If we used the default interval, scenario (1) would fail too, because the
> NN would give the client the same dead datanodes.)
> This bug was found by our Failure Testing Service framework:
> http://www.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-98.html
> For questions, please email us: Thanh Do (thanhdo@cs.wisc.edu) and
> Haryadi Gunawi (haryadi@eecs.berkeley.edu)
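The retry behavior the report describes for scenario (1) can be sketched as an outer loop that re-asks the NN after waiting long enough for it to expire dead datanodes; scenario (2) is the case where such a loop is never re-entered. Everything below (the Namenode interface, method names, the configurable wait) is illustrative, not the real DFSClient API.

```java
import java.util.List;
import java.util.Set;

// Hedged sketch of the reporter's scenario (1): keep re-asking the
// namenode for a pipeline, sleeping between attempts so the NN has time
// to notice dead datanodes. All names here are hypothetical.
public class OuterRetrySketch {

    interface Namenode {
        List<String> allocatePipeline();
    }

    /**
     * Returns a pipeline containing no datanode the client already saw
     * crash, or an empty list after maxRetries failed attempts.
     */
    static List<String> allocateFresh(Namenode nn, Set<String> knownDead,
                                      int maxRetries, long waitMillis)
            throws InterruptedException {
        for (int i = 0; i < maxRetries; i++) {
            List<String> pipeline = nn.allocatePipeline();
            if (pipeline.stream().noneMatch(knownDead::contains)) {
                return pipeline;      // fresh, healthy datanodes
            }
            Thread.sleep(waitMillis); // e.g. 6000 ms in the report's setup
        }
        return List.of();             // give up: the write fails
    }

    public static void main(String[] args) throws InterruptedException {
        // Fake NN that returns stale (dead) nodes once, then fresh ones.
        Namenode nn = new Namenode() {
            int calls = 0;
            public List<String> allocatePipeline() {
                return ++calls == 1 ? List.of("dn1", "dn2")
                                    : List.of("dn3", "dn4");
            }
        };
        System.out.println(
            allocateFresh(nn, Set.of("dn1", "dn2"), 3, 10)); // [dn3, dn4]
    }
}
```

The report's complaint, in these terms, is that the second-phase code path never reaches such a loop at all.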

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

