hadoop-common-dev mailing list archives

From "Owen O'Malley (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-547) ReduceTaskRunner can miss sending heartbeats if no map output copy finishes within "mapred.task.timeout"
Date Mon, 25 Sep 2006 18:37:51 GMT
    [ http://issues.apache.org/jira/browse/HADOOP-547?page=comments#action_12437635 ] 
Owen O'Malley commented on HADOOP-547:

Instead of adding a new timer to the ReduceTaskRunner, I think it would be far easier to have
the PingTimer just call reportProgress when the progress() method is called.

To get access to the TaskTracker and Task, PingTimer would need to be a non-static inner class
instead of a static one.
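
A minimal sketch of that idea, not the actual Hadoop code: the enclosing class InnerTimerSketch, the counter, and both method signatures are assumptions made for illustration. Only the names PingTimer, progress() and reportProgress come from the comment above. It shows why the inner class has to be non-static: progress() can then reach the enclosing object's state directly.

```java
// Hypothetical stand-in for the enclosing task runner; not Hadoop's real class.
class InnerTimerSketch {
    private int reportsSent = 0;

    // Assumed hook: the real code would report progress to the TaskTracker here.
    private synchronized void reportProgress() {
        reportsSent++;
    }

    synchronized int getReportsSent() {
        return reportsSent;
    }

    // Non-static inner class: every instance holds an implicit reference
    // to the InnerTimerSketch object that created it.
    class PingTimer {
        void progress() {
            // Resolves to InnerTimerSketch.this.reportProgress().
            // A static nested class could not make this call without
            // being handed an explicit reference to the outer object.
            reportProgress();
        }
    }

    PingTimer newTimer() {
        return new PingTimer();
    }
}
```

With this shape, every call to progress() doubles as a report from the enclosing object, so no second timer is needed.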

> ReduceTaskRunner can miss sending heartbeats if no map output copy finishes within "mapred.task.timeout"
> -------------------------------------------------------------------------------------------------------
>                 Key: HADOOP-547
>                 URL: http://issues.apache.org/jira/browse/HADOOP-547
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.6.2
>            Reporter: Sanjay Dahiya
>         Assigned To: Sanjay Dahiya
>         Attachments: Hadoop-547.patch
> In ReduceTaskRunner, the main loop that sends heartbeats waits on copyResults, and that wait
> returns only when a copy thread finishes copying. This can cause healthy reduce tasks that are
> still copying data to fail if no map task output is copied within "mapred.task.timeout".
> ReduceTaskRunner.java:490
>         try {
>           copyResults.wait();   // unconditional wait: no timeout
>         } catch (InterruptedException e) { }
> wait() should be called with a timeout, possibly taskTimeout/2, after which the loop should
> send a heartbeat and go back to waiting.
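
The fix proposed in the issue description could look roughly like the sketch below. It is an illustration under stated assumptions, not the attached patch: the class TimedWaitSketch, the sendHeartbeat() counter, and the String result type are hypothetical, and only copyResults and the taskTimeout/2 interval come from the report.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the reduce-side copy loop; not Hadoop's real class.
class TimedWaitSketch {
    // Plays the role of copyResults from the report (element type assumed).
    private final List<String> copyResults = new ArrayList<>();
    private int heartbeatsSent = 0;

    // Assumed hook: the real code would ping the TaskTracker here.
    private void sendHeartbeat() {
        heartbeatsSent++;
    }

    int getHeartbeatsSent() {
        synchronized (copyResults) {
            return heartbeatsSent;
        }
    }

    // Called by a copier thread when one map output has been copied.
    void addResult(String result) {
        synchronized (copyResults) {
            copyResults.add(result);
            copyResults.notifyAll();
        }
    }

    // Waits for the next copy result, waking at least every taskTimeout/2
    // milliseconds to send a heartbeat instead of blocking indefinitely.
    String waitForResult(long taskTimeoutMs) throws InterruptedException {
        long waitMs = Math.max(1, taskTimeoutMs / 2);
        synchronized (copyResults) {
            while (copyResults.isEmpty()) {
                copyResults.wait(waitMs);  // returns on notify or on timeout
                sendHeartbeat();
            }
            return copyResults.remove(0);
        }
    }
}
```

The while loop re-checks the condition after every wake-up, so a timeout or a spurious wake-up only costs one extra heartbeat rather than returning without a result.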


