hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-181) task trackers should not restart for having a late heartbeat
Date Fri, 11 Aug 2006 18:40:15 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-181?page=comments#action_12427581 ] 
Doug Cutting commented on HADOOP-181:

> Why does one turn off speculative execution?

In the case of the Nutch crawler, speculative execution is disabled to observe politeness.
We do not want two tasks to attempt to fetch pages from a site at the same time.

This patch adds a fair amount of complexity, introducing a new state for tasks (presumed dead,
but reanimatable).  A new state is likely to add new failure modes.

Does anyone deny that this primarily addresses an issue that would go away if we could more
reliably detect tasktracker death?  Shouldn't we attempt to fix that first?  Sameer raises
the issue of "transient network problems".  Are we actually seeing these?  Even if these were
to occur, the system would operate correctly as-is: this is an optimization.  Is this a common-enough
case that we can afford to optimize it?
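
To make the trade-off concrete, here is a minimal sketch of expiry-based death detection, assuming a master that simply records the time of each tracker's last heartbeat. The class name, method names, and the expiry value are hypothetical and are not taken from Hadoop or from the attached patch. It only illustrates the distinction under discussion: a tracker whose heartbeat arrives after the expiry interval was presumed dead in the meantime, yet it can still report in.

```python
from dataclasses import dataclass, field

# Hypothetical expiry interval in seconds; not a value taken from Hadoop.
EXPIRY_SECONDS = 600.0


@dataclass
class TrackerTable:
    """Toy model of a master tracking the last heartbeat of each worker."""

    expiry: float = EXPIRY_SECONDS
    last_seen: dict = field(default_factory=dict)

    def heartbeat(self, tracker: str, now: float) -> str:
        """Record a heartbeat and classify it relative to the previous one."""
        previous = self.last_seen.get(tracker)
        self.last_seen[tracker] = now
        if previous is None:
            return "registered"
        if now - previous > self.expiry:
            # The tracker was silent for longer than the expiry interval.
            return "late"
        return "on-time"

    def presumed_dead(self, now: float) -> list:
        """Trackers whose last heartbeat is older than the expiry interval."""
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.expiry
        )
```

In this model a "late" result is the case the issue is about: the master can either force the tracker to restart and re-run its tasks, which is correct but wasteful, or accept the reported status. How often that case arises depends on how reliably silence indicates real death, which is the question raised above.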

> task trackers should not restart for having a late heartbeat
> ------------------------------------------------------------
>                 Key: HADOOP-181
>                 URL: http://issues.apache.org/jira/browse/HADOOP-181
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Owen O'Malley
>         Assigned To: Devaraj Das
>             Fix For: 0.6.0
>         Attachments: lost-heartbeat.patch
> TaskTrackers should not close and restart themselves for having a late heartbeat. The
> JobTracker should just accept their current status.

This message is automatically generated by JIRA.
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira