hadoop-common-dev mailing list archives

From "Wang, Chengwei" <wan...@gatech.edu>
Subject Questions about progress score and speculative execution
Date Fri, 12 Nov 2010 17:16:19 GMT
Thanks Rekha, it is really helpful!

Could you, or anybody else, please also help me with the following questions?

1. How can I get the progress score of each task (map or reduce)? Can I get it from the
log files, either directly or by setting the log level to "debug", or do I need to change
the Hadoop source?

2. For speculative execution, Hadoop looks at the average progress score of the map tasks
(or of the reduce tasks) and compares each task's progress score with that average. If a
task's score is less than (average - 0.2), the task is a straggler. For example, if there
are 10 map tasks, we first compute the average progress score of the 10 map tasks, then
compare each of the 10 map tasks with the average to find the stragglers. Is my
understanding of the algorithm right? Please correct me if I am wrong.
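
To make question 2 concrete, here is a minimal Python sketch of the heuristic exactly as I described it above. It is not taken from the Hadoop source: the function name find_stragglers, the task IDs, and the progress values are all hypothetical, and the 0.2 gap is just the number from my description.

```python
from statistics import mean

# Gap below the average at which a task counts as a straggler.
# Assumption: 0.2 is the value quoted in the question, not read from Hadoop.
SPECULATIVE_GAP = 0.2


def find_stragglers(progress, gap=SPECULATIVE_GAP):
    """Return the IDs of tasks whose progress score is below (average - gap).

    `progress` maps a task ID to a progress score in [0.0, 1.0] and should
    contain tasks of one kind only (all map tasks or all reduce tasks).
    """
    if not progress:
        return []
    average = mean(progress.values())
    return sorted(task for task, score in progress.items()
                  if score < average - gap)


# Hypothetical example: nine map tasks at 0.9 and one lagging at 0.3.
map_progress = {"map_%d" % i: 0.9 for i in range(9)}
map_progress["map_9"] = 0.3

stragglers = find_stragglers(map_progress)
print(stragglers)
```

In this made-up example the average is 0.84, so the cut-off is 0.64 and only map_9 falls below it.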

Thanks a lot!

----- Original Message -----
From: "Rekha Joshi" <rekhajos@yahoo-inc.com>
To: common-dev@hadoop.apache.org
Sent: Friday, November 12, 2010 12:53:25 AM
Subject: Re: about the task statistics in the history directory

Hi Chengwei,

If it helps, reading the Hadoop tutorial and the configuration files, along with the
JobHistory* API pages, should give you the main details.
For example: http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/JobHistory.MapAttempt.html

There is a typo in the API docs - http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/JobHistory
"JobHistory.ReduceAttempt <http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/JobHistory.ReduceAttempt.html>
          Helper class for logging or reading back events related to start, finish or failure
of  a Map Attempt on a node."

It should be "Reduce" instead of "Map". Use your judgment. :)

Just an example that only the code is gospel truth; the API docs and other documents are a guiding force.

Thanks & Regards,

On 11/12/10 7:57 AM, "Wang, Chengwei" <wangcw@gatech.edu> wrote:

Hi All,

I just wonder whether there is any doc explaining the terms used in the task statistics
under logs/history/. For example, 'SPLITS' or 'MapAttempt'?

Thanks a lot for enlightening me.

