hadoop-common-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-7488) When Namenode network is unplugged, DFSClient operations waits for ever
Date Sun, 07 Aug 2011 00:25:27 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-7488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13080491#comment-13080491 ]

Konstantin Shvachko commented on HADOOP-7488:
---------------------------------------------

If {{rpcTimeout > 0}}, then {{handleTimeout()}} will throw {{SocketTimeoutException}} instead
of going into the ping loop. Can you get the required behavior by setting {{rpcTimeout >
0}} rather than introducing a limit on the number of pings?
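
To make the distinction concrete, here is a minimal sketch of the two behaviors. It is a simplified model, not the Hadoop IPC source: the class {{RpcTimeoutSketch}}, its {{read()}} method and the ping counter are hypothetical, and only the names {{rpcTimeout}}, {{handleTimeout()}} and {{SocketTimeoutException}} come from the discussion above.

```java
import java.net.SocketTimeoutException;

// Simplified model of the behavior described above; NOT the Hadoop IPC code.
public class RpcTimeoutSketch {
    private final int rpcTimeout; // milliseconds; 0 is assumed to mean "keep pinging"
    private int pingsSent = 0;    // hypothetical counter, for illustration only

    public RpcTimeoutSketch(int rpcTimeout) {
        this.rpcTimeout = rpcTimeout;
    }

    // Called each time a socket read times out without a response.
    public void handleTimeout(SocketTimeoutException e) throws SocketTimeoutException {
        if (rpcTimeout > 0) {
            throw e;     // surface the timeout to the caller
        }
        pingsSent++;     // otherwise "send a ping" and keep waiting
    }

    // Simulates a server that stays silent for silentReads read timeouts, then answers.
    public String read(int silentReads) throws SocketTimeoutException {
        for (int i = 0; i < silentReads; i++) {
            handleTimeout(new SocketTimeoutException("no response from server"));
        }
        return "response";
    }

    public int pingsSent() {
        return pingsSent;
    }

    public static void main(String[] args) {
        RpcTimeoutSketch daemon = new RpcTimeoutSketch(0);
        try {
            System.out.println(daemon.read(5) + " after " + daemon.pingsSent() + " pings");
        } catch (SocketTimeoutException e) {
            System.out.println("unexpected: " + e.getMessage());
        }

        RpcTimeoutSketch client = new RpcTimeoutSketch(60000);
        try {
            client.read(5);
        } catch (SocketTimeoutException e) {
            System.out.println("timed out: " + e.getMessage());
        }
    }
}
```

In this model the caller with {{rpcTimeout = 0}} waits as long as the server stays silent, while the caller with a positive {{rpcTimeout}} gets an exception on the first read timeout, which is the behavior a ping limit would otherwise have to emulate.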

DataNodes and TaskTrackers are designed to ping the NN and JT indefinitely, because during startup
you cannot predict when the NN will come online; that depends on the size of the image and edits.
Also, when the NN becomes busy, it is important for DNs to keep retrying rather than assuming the
NN is dead.

For DFSClient this may make sense, but I think clients already time out. At least DFSShell ls does.
And even if they don't, this should be an HDFS change, not a generic IPC change that affects
many Hadoop components.
 
As for HA, I don't know what you did for HA and therefore cannot understand what problem you
are trying to solve here. I can guess that you want DNs to switch to another NN when they time out
rather than retrying. In that case you should be able to use rpcTimeout.
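
As an illustration of that guess only, the sketch below shows a caller that moves on to the next endpoint when a call times out. Everything in it is hypothetical ({{FailoverSketch}}, {{Endpoint}}, {{callWithFailover()}}); it assumes each call is already bounded by a positive rpcTimeout and therefore throws {{SocketTimeoutException}} instead of blocking.

```java
import java.net.SocketTimeoutException;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; these types are not part of Hadoop.
public class FailoverSketch {
    // Stand-in for an RPC proxy to one NN; assumed to time out rather than hang.
    public interface Endpoint {
        String call() throws SocketTimeoutException;
    }

    // Tries each endpoint in order and returns the first successful answer.
    public static String callWithFailover(List<Endpoint> endpoints)
            throws SocketTimeoutException {
        SocketTimeoutException last = null;
        for (Endpoint ep : endpoints) {
            try {
                return ep.call();
            } catch (SocketTimeoutException e) {
                last = e; // this endpoint timed out; try the next one
            }
        }
        if (last == null) {
            throw new SocketTimeoutException("no endpoints configured");
        }
        throw last; // every endpoint timed out
    }

    public static void main(String[] args) throws SocketTimeoutException {
        Endpoint unplugged = () -> {
            throw new SocketTimeoutException("nn1 timed out");
        };
        Endpoint standby = () -> "answer from nn2";
        System.out.println(callWithFailover(Arrays.asList(unplugged, standby)));
    }
}
```

The point of the sketch is that the switch-over decision lives in the caller; the IPC layer only needs to report the timeout.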

> When Namenode network is unplugged, DFSClient operations waits for ever
> -----------------------------------------------------------------------
>
>                 Key: HADOOP-7488
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7488
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: ipc
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>         Attachments: HADOOP-7488.patch
>
>
> When NN/DN is shutdown gracefully, the DFSClient operations which are waiting for a response
> from NN/DN, will throw exception & come out quickly
> But when the NN/DN network is unplugged, the DFSClient operations which are waiting for
> a response from NN/DN, waits for ever.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
