hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Created: (HADOOP-1201) Progress reporting can be improved for both Map/Reduce tasks
Date Wed, 04 Apr 2007 05:50:32 GMT
Progress reporting can be improved for both Map/Reduce tasks
------------------------------------------------------------

                 Key: HADOOP-1201
                 URL: https://issues.apache.org/jira/browse/HADOOP-1201
             Project: Hadoop
          Issue Type: Improvement
          Components: mapred
            Reporter: Devaraj Das
             Fix For: 0.13.0


Both the map and reduce tasks report progress from separate threads. In the ReduceTask,
however, once the sort phase has finished, progress is reported inline with the reducer
invocations. This slows down the reduce phase, since every progress report involves an RPC.
It would be better to report progress for all phases in separate threads and have the tasks
just update the progress fields.
One proposal is to extract the reporting logic that currently lives in MapTask/ReduceTask
and put it in the Task superclass as a new class, with methods that control what is
reported and when. Thoughts?
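
To make the proposal concrete, here is a minimal sketch of the idea, not the actual Hadoop code. The names ProgressSink and ProgressReporter, the single float progress field, and the fixed reporting interval are all assumptions made for illustration. ProgressSink stands in for whatever RPC call the task uses to report to its parent. The task thread only writes a field, and a background thread does the reporting.

```java
/** Hypothetical stand-in for the RPC a task uses to report progress. */
interface ProgressSink {
    void report(float progress);
}

/**
 * Sketch of a reporter shared by all task phases (assumed design):
 * the task updates a field, a background thread sends the reports.
 */
class ProgressReporter implements Runnable {
    private final ProgressSink sink;
    private final long intervalMillis;   // assumed fixed reporting interval
    private volatile float progress;     // written by the task, read by the reporter
    private volatile boolean done;
    private Thread thread;

    ProgressReporter(ProgressSink sink, long intervalMillis) {
        this.sink = sink;
        this.intervalMillis = intervalMillis;
    }

    /** Cheap enough for the reducer's inner loop: a field write, no RPC. */
    void setProgress(float p) {
        progress = p;
    }

    void start() {
        thread = new Thread(this, "progress-reporter");
        thread.setDaemon(true);
        thread.start();
    }

    /** Stops the thread, then sends one last report so the final value is not lost. */
    void stop() throws InterruptedException {
        done = true;
        thread.interrupt();   // wake the reporter if it is sleeping
        thread.join();
        sink.report(progress);
    }

    @Override
    public void run() {
        while (!done) {
            sink.report(progress);   // the only place the (simulated) RPC happens
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Fall through: the loop condition re-checks the done flag.
            }
        }
    }
}
```

With something like this in place, a reduce loop would call setProgress() once per record or key, and the number of RPCs would depend on the interval rather than on the number of reducer invocations.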

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

