hadoop-common-dev mailing list archives

From "Doug Cutting (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-181) task trackers should not restart for having a late heartbeat
Date Thu, 10 Aug 2006 06:30:18 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-181?page=comments#action_12427115 ] 
Doug Cutting commented on HADOOP-181:

> improved detection of tasktracker death is a separate issue

It's certainly related.  This issue deals with fixing things when tasktracker deaths are misdetected.
If we didn't misdetect them so often, this would not be an issue.

> Multiple instances of a task should be handled by the speculative execution code.

What if speculative execution is disabled in the job?  Then we'd get multiple instances of
the task running at once, when the client explicitly requested that not happen.
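
To make that scenario concrete, here is a toy sketch in Java. It is not Hadoop's JobTracker code: the class, its methods, and the tracker and task names are all hypothetical, and it assumes the JobTracker has already rescheduled the task elsewhere by the time the late heartbeat arrives.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy model of the bookkeeping involved; none of these
// names come from the Hadoop code base or from the attached patch.
class LateHeartbeatSketch {
    // task id -> trackers currently believed to be running that task
    private final Map<String, List<String>> running = new HashMap<>();
    private final boolean speculativeExecution;

    LateHeartbeatSketch(boolean speculativeExecution) {
        this.speculativeExecution = speculativeExecution;
    }

    // Record that a tracker is running an instance of the task.
    void assign(String task, String tracker) {
        running.computeIfAbsent(task, k -> new ArrayList<>()).add(tracker);
    }

    // The tracker missed its heartbeat window: treat it as dead and
    // reschedule its task on a replacement tracker.
    void declareLost(String task, String lostTracker, String replacement) {
        List<String> trackers = running.get(task);
        if (trackers != null) {
            trackers.remove(lostTracker);
        }
        assign(task, replacement);
    }

    // The "lost" tracker reports in late and its status is accepted as is.
    // Returns true when that leaves more simultaneous instances of the
    // task than the job asked for (speculative execution disabled).
    boolean acceptLateStatus(String task, String tracker) {
        assign(task, tracker);
        int instances = running.get(task).size();
        return instances > 1 && !speculativeExecution;
    }
}
```

With speculative execution disabled, assigning a task to one tracker, declaring that tracker lost, and then accepting its late status leaves two live instances, so acceptLateStatus returns true. With speculative execution enabled, the same sequence returns false, since the duplicate can be treated like any speculative instance.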

I'm +0 on this patch.  If others feel strongly that it's the best approach, I won't veto it.
But I would prefer we address the root problem first, and then see if this is still an issue,
before adding this new mechanism.  Does that make sense, or am I missing something?

> task trackers should not restart for having a late heartbeat
> ------------------------------------------------------------
>                 Key: HADOOP-181
>                 URL: http://issues.apache.org/jira/browse/HADOOP-181
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>            Reporter: Owen O'Malley
>         Assigned To: Devaraj Das
>             Fix For: 0.6.0
>         Attachments: lost-heartbeat.patch
> TaskTrackers should not close and restart themselves for having a late heartbeat. The
> JobTracker should just accept their current status.


