hadoop-common-dev mailing list archives

From "Johan Oskarson (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-76) Implement speculative re-execution of reduces
Date Fri, 30 Jun 2006 15:33:30 GMT
     [ http://issues.apache.org/jira/browse/HADOOP-76?page=all ]

Johan Oskarson updated HADOOP-76:

    Attachment: spec_reducev.patch

I've tried to implement speculative reduces and it seems to be working. However, I'd like you
to take a look at it, since I'm not familiar with some of the inner workings of Hadoop.

As suggested, each reduce attempt writes its output under a temporary name, and the first one
to finish moves it to the correct output name.
The patch adds a String tmpName parameter to getRecordWriter in OutputFormatBase,
plus a close method. Basically, OutputFormatBase keeps track of the tmpName and the final
name; once close is called, it moves the tmp file to the final name.

This means the current output formats don't have to be changed.
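
To illustrate the temporary-name idea described above, here is a minimal sketch that uses plain java.nio on a local directory rather than Hadoop's own FileSystem API. The class TempOutputSketch, the helpers writeTemp and commit, and the file names are all hypothetical and are not taken from the patch. It assumes that a move without REPLACE_EXISTING fails when the target already exists, which is how "first one to finish wins" is approximated here; on a real distributed file system the rename semantics would need to be checked.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

class TempOutputSketch {
    /** Hypothetical helper: write one attempt's output under a temporary name. */
    static Path writeTemp(Path dir, String tmpName, String data) throws IOException {
        Path tmp = dir.resolve(tmpName);
        Files.write(tmp, data.getBytes(StandardCharsets.UTF_8));
        return tmp;
    }

    /**
     * Hypothetical "close" step: move the temporary file to the final name.
     * Returns true if this attempt's output became the final output.
     */
    static boolean commit(Path tmp, Path finalPath) throws IOException {
        try {
            // No REPLACE_EXISTING option, so the move fails if finalPath exists.
            Files.move(tmp, finalPath);
            return true;
        } catch (FileAlreadyExistsException e) {
            // Another attempt finished first; discard this attempt's output.
            Files.deleteIfExists(tmp);
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("reduce-out");
        Path finalPath = dir.resolve("part-00000");

        Path original = writeTemp(dir, "_tmp_attempt_0", "original\n");
        Path speculative = writeTemp(dir, "_tmp_attempt_1", "speculative\n");

        // Suppose the speculative attempt happens to finish first.
        System.out.println("speculative committed: " + commit(speculative, finalPath));
        System.out.println("original committed: " + commit(original, finalPath));
    }
}
```

Because the rename happens in one place, writers of individual output formats would not need to know about temporary names in a design like this.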

This patch would ideally be complemented by better tasktracker selection. I've seen instances
where there are two final reduce TIPs and a speculative reduce is then assigned to the same
node that is already running the other task.
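
One possible shape for such a selection rule, sketched under assumptions: the class SpeculativePlacementSketch, the method pickHost, and the host names are hypothetical, and this is not how the JobTracker actually assigns tasks. The idea is only to skip any node that already runs an attempt of the same task.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Set;

class SpeculativePlacementSketch {
    /**
     * Hypothetical rule: pick the first idle host that is not already
     * running an attempt of the task we want to speculate on.
     */
    static Optional<String> pickHost(List<String> idleHosts, Set<String> hostsRunningTask) {
        for (String host : idleHosts) {
            if (!hostsRunningTask.contains(host)) {
                return Optional.of(host);
            }
        }
        // Every idle host already runs this task; do not speculate yet.
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> idle = Arrays.asList("node1", "node2");
        Set<String> running = Collections.singleton("node1");
        System.out.println(pickHost(idle, running));
    }
}
```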

A speculative reduce will be started if finishedReduces / numReduceTasks >= 0.7
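
The condition above could be expressed roughly as follows. This is a sketch, not the code in the patch: the class SpeculativeThreshold and the method shouldSpeculate are hypothetical names. It assumes both counters are ints, in which case the division has to be done in floating point, since integer division in Java would yield only 0 or 1.

```java
class SpeculativeThreshold {
    /** Fraction of reduces that must be finished before speculating (from the message above). */
    static final double THRESHOLD = 0.7;

    /** Hypothetical check mirroring finishedReduces / numReduceTasks >= 0.7. */
    static boolean shouldSpeculate(int finishedReduces, int numReduceTasks) {
        if (numReduceTasks <= 0) {
            return false; // nothing to speculate on, and avoids dividing by zero
        }
        // Cast to double so the ratio is not truncated by integer division.
        return (double) finishedReduces / numReduceTasks >= THRESHOLD;
    }

    public static void main(String[] args) {
        System.out.println(shouldSpeculate(6, 10));
        System.out.println(shouldSpeculate(7, 10));
    }
}
```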

That's about it. Looking forward to hearing your input.

> Implement speculative re-execution of reduces
> ---------------------------------------------
>          Key: HADOOP-76
>          URL: http://issues.apache.org/jira/browse/HADOOP-76
>      Project: Hadoop
>         Type: Improvement

>   Components: mapred
>     Versions: 0.1.0
>     Reporter: Doug Cutting
>     Assignee: Owen O'Malley
>     Priority: Minor
>      Fix For: 0.5.0
>  Attachments: spec_reducev.patch
>
> As a first step, reduce task outputs should go to temporary files which are renamed when
> the task completes.
