hadoop-common-dev mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4724) TaskTracker, DataNode, and SecondaryNameNode should timeout on waiting for its server to be up
Date Thu, 27 Nov 2008 11:38:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12651331#action_12651331 ]

Steve Loughran commented on HADOOP-4724:
----------------------------------------

>Something like datanode.connect.timeout, tasktracker.connect.timeout, dfsclient.connect.timeout...

Maybe include the fact that this is for IPC timeouts, not, say, HTTP:

datanode.ipc.connect.timeout
tasktracker.ipc.connect.timeout
dfsclient.ipc.connect.timeout
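
A rough sketch of how a daemon might look up one of these keys, in case it helps the discussion. The key names are only the ones proposed above, not settings that exist in Hadoop today; plain java.util.Properties stands in for the real configuration class; and the class name, the helper name, and the choice of milliseconds as the unit are assumptions made for illustration.

```java
import java.util.Properties;

// Hypothetical helper: the keys are the names proposed in this thread,
// not existing Hadoop configuration options.
class IpcTimeoutKeys {
    static final String DATANODE = "datanode.ipc.connect.timeout";
    static final String TASKTRACKER = "tasktracker.ipc.connect.timeout";
    static final String DFSCLIENT = "dfsclient.ipc.connect.timeout";

    // Returns the configured timeout (assumed to be in milliseconds),
    // or the caller-supplied fallback when the key is not set.
    static long connectTimeoutMillis(Properties conf, String key, long fallbackMillis) {
        String raw = conf.getProperty(key);
        return raw == null ? fallbackMillis : Long.parseLong(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(TASKTRACKER, "60000"); // e.g. a test cluster that fails fast
        System.out.println(connectTimeoutMillis(conf, TASKTRACKER, 5000L));
        System.out.println(connectTimeoutMillis(conf, DATANODE, 5000L));
    }
}
```

Keeping "ipc" in the key leaves room for a separate HTTP timeout later without renaming anything.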

>I am thinking to start with a large number like 1 hour or 1 day. It is at least backwards compatible.

24 hours would be good. It lets you handle the kind of outage that has the team paged in from
home, and it removes the "fix this in 15 minutes before the nodes start giving up" crisis.
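
To make the behaviour concrete, here is a minimal sketch of a bounded wait with a 24 hour default. It is not the patch for this issue: the class name, method name, and retry interval are hypothetical, and the probe is a stand-in for whatever IPC connection attempt the daemon really makes.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of "retry until the server is up, but give up eventually".
class BoundedServerWait {
    // Default suggested in this thread: 24 hours, expressed in milliseconds.
    static final long DEFAULT_TIMEOUT_MS = TimeUnit.HOURS.toMillis(24);

    // Polls the probe until it reports success or the timeout elapses.
    // Returns true if the server came up, false on timeout or interruption.
    static boolean waitForServer(BooleanSupplier probe, long timeoutMs, long retryIntervalMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (true) {
            if (probe.getAsBoolean()) {
                return true;
            }
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // deadline passed: stop waiting instead of looping forever
            }
            // Sleep for the retry interval, but never past the deadline.
            long remainingMs = Math.max(1L, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
            try {
                Thread.sleep(Math.min(retryIntervalMs, remainingMs));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in probe that succeeds on the third attempt.
        int[] attempts = {0};
        boolean up = waitForServer(() -> ++attempts[0] >= 3, DEFAULT_TIMEOUT_MS, 10L);
        System.out.println("server up: " + up + " after " + attempts[0] + " attempts");
    }
}
```

A test cluster would pass a much smaller timeout so that a missing server shows up as a failure in seconds rather than a hang.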

> TaskTracker, DataNode, and SecondaryNameNode should timeout on waiting for its server to be up
> ----------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4724
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4724
>             Project: Hadoop Core
>          Issue Type: Bug
>            Reporter: Hairong Kuang
>             Fix For: 0.20.0
>
>
> TaskTracker, DataNode, and SecondaryNameNode currently wait forever if its server is
> not up. They should be designed to take a configuration parameter that tells them when to
> give up, and a default value of many minutes/hours or more to deal with basic choreography
> issues in a cluster. Test clusters can be set up to fail sooner rather than later.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

