hadoop-hdfs-issues mailing list archives

From "Hairong Kuang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-724) Pipeline close hangs if one of the datanode is not responsive.
Date Fri, 06 Nov 2009 19:26:32 GMT

    [ https://issues.apache.org/jira/browse/HDFS-724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12774382#action_12774382 ]

Hairong Kuang commented on HDFS-724:
------------------------------------

If a datanode really becomes non-responsive, the DFS client is able to detect the problem.

The issue here is that the test simulates a non-responsive block receiver. While the block
receiver is blocked, the packet responder is still alive and periodically sends heartbeats back,
so the client still thinks the pipeline is working well.

The solution:
1. The packet responder does not send heartbeats. Instead, turn on TCP/IP-level heartbeats
by setting the socket "keepalive" option to true.
2. The DFS client does not try to receive acks until there is at least one outstanding packet.
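
A minimal Java sketch of the two points above, for illustration only. It is not the HDFS
client code: the class PipelineAckSketch, its methods, the one-byte "ack", and the read
timeout set with setSoTimeout are all assumptions made for this example. Only
Socket.setKeepAlive(true) corresponds directly to point 1. The read timeout is added on the
assumption that, once application-level heartbeats are gone, a bounded read is what lets the
client notice a missing ack.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.ArrayDeque;
import java.util.Queue;

/** Hypothetical sketch; names and wire format are not taken from HDFS. */
public class PipelineAckSketch {
    private final Socket socket;
    // Sequence numbers of packets that were sent but not yet acknowledged.
    private final Queue<Long> outstanding = new ArrayDeque<>();

    public PipelineAckSketch(Socket socket, int ackTimeoutMs) throws IOException {
        this.socket = socket;
        // Point 1: rely on TCP keepalive instead of application-level heartbeats.
        socket.setKeepAlive(true);
        // Assumption: bound each ack read so a silent pipeline is detected.
        socket.setSoTimeout(ackTimeoutMs);
    }

    /** Record that a packet went out and now awaits an ack. */
    public void packetSent(long seqno) {
        outstanding.add(seqno);
    }

    /**
     * Point 2: only read from the socket when an ack is actually expected.
     * Returns the acked sequence number, or -1 if nothing is outstanding.
     * Throws SocketTimeoutException if the ack does not arrive in time.
     */
    public long receiveAck() throws IOException {
        if (outstanding.isEmpty()) {
            return -1L; // idle pipeline: do not block, do not time out
        }
        InputStream in = socket.getInputStream();
        int b = in.read(); // simplified one-byte ack; blocks up to the timeout
        if (b < 0) {
            throw new IOException("pipeline closed before ack");
        }
        return outstanding.remove();
    }

    public static void main(String[] args) throws IOException {
        // A loopback peer that never answers stands in for a blocked datanode.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket datanode = server.accept()) {
            PipelineAckSketch acks = new PipelineAckSketch(client, 200);
            System.out.println("idle: " + acks.receiveAck());
            acks.packetSent(1L);
            try {
                acks.receiveAck();
            } catch (SocketTimeoutException e) {
                System.out.println("no ack within timeout; pipeline treated as failed");
            }
        }
    }
}
```

In this sketch an idle client never blocks on the socket, so the timeout can only fire while
a packet (including the empty close packet) is waiting for its ack.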

> Pipeline close hangs if one of the datanode is not responsive.
> --------------------------------------------------------------
>
>                 Key: HDFS-724
>                 URL: https://issues.apache.org/jira/browse/HDFS-724
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node, hdfs client
>            Reporter: Tsz Wo (Nicholas), SZE
>         Attachments: h724_20091021.patch
>
>
> In the new pipeline design, pipeline close is implemented by sending an additional empty
> packet.  If one of the datanodes does not respond to this empty packet, the pipeline hangs.
> It seems that there is no timeout.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

