hadoop-common-user mailing list archives

From Mathijs Homminga <mathijs.hommi...@knowlogy.nl>
Subject Re: Re-reduce, without re-map
Date Tue, 03 Apr 2007 10:27:16 GMT
Each reduce task (Nutch indexing job) gets as far as 66%, and then fails with the following
error:

"Task failed to report status for 600 seconds. Killing."

In the end, no reduce task completes successfully. 
Besides solving this issue, I was wondering if I could update code and configuration and start
the reduce phase again without having to redo all map tasks (that would save me 2 hours). Assuming,
of course, that the output of the map tasks has not changed.

Mathijs
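
For reference, the 600 seconds in the error above matches the default task timeout of 600000 ms. A minimal sketch of raising it, assuming the limit is controlled by the `mapred.task.timeout` property and that site overrides go in `hadoop-site.xml`; the 30-minute value is only an illustration, not a recommendation:

```xml
<!-- hadoop-site.xml (assumed location for site-specific overrides) -->
<property>
  <name>mapred.task.timeout</name>
  <!-- Milliseconds a task may go without reading input, writing output,
       or updating its status before it is killed. 1800000 ms = 30 minutes;
       this value is an illustrative assumption. -->
  <value>1800000</value>
</property>
```

Raising the timeout only hides a long silent stretch in the reduce. If the reduce code is genuinely busy during that time, having it report status periodically through the reporter it is handed is probably the cleaner fix.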



Arun C Murthy wrote:
> Hi Mathijs,
>
> Mathijs Homminga wrote:
>>
>> We have some troubles with the reduce phase of our job.
>> Is it possible to re-execute the reduce tasks without the need to do 
>> all map tasks again?
>>
>
>   That the MR-framework already does... you don't have to re-execute 
> the maps for the *failed* reduces. Are you noticing something else?
>
>   What are the 'troubles' you allude to? Also, once we get 
> HADOOP-1127 in, you should try turning on 'speculative execution' - 
> that helps when some tasks are very slow w.r.t. other similar tasks.
>
> Arun
>
>> Thanks!
>> Mathijs Homminga
>

