Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <28656095.1151681610629.JavaMail.jira@brutus>
Date: Fri, 30 Jun 2006 15:33:30 +0000 (GMT+00:00)
From: "Johan Oskarson (JIRA)" <jira@apache.org>
To: hadoop-dev@lucene.apache.org
Subject: [jira] Updated: (HADOOP-76) Implement speculative re-execution of
 reduces
In-Reply-To: <1176204687.1142031469402.JavaMail.jira@ajax>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

     [ http://issues.apache.org/jira/browse/HADOOP-76?page=all ]

Johan Oskarson updated HADOOP-76:
---------------------------------

    Attachment: spec_reducev.patch

I've tried to implement speculative reduces and it seems to be working. However, I'd like someone to review it, since I'm not familiar with some of the inner workings of Hadoop.

As suggested, each reduce writes its output under a temporary name, and the first attempt to finish moves it to the correct output name.
The patch adds a String tmpName parameter to getRecordWriter in OutputFormatBase,
plus a close method. Basically, OutputFormatBase keeps track of both the tmpName and the final name;
once close is called, it moves the tmp output to the final name.
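
A minimal sketch of the write-to-tmp-then-rename idea, for illustration only. It is not the code in spec_reducev.patch: the class TmpThenRenameWriter, its constructor arguments and isCommitted() are hypothetical, and it uses the local filesystem through java.nio.file rather than Hadoop's own filesystem classes. Note also that the existence check in Files.move is not guaranteed to be atomic, so a real implementation would need a stronger guarantee from the filesystem.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Hypothetical stand-in for the writer described above. Each task attempt
 * writes under its own tmpName and tries to claim the final name on close.
 */
class TmpThenRenameWriter implements AutoCloseable {
    private final Path tmpPath;    // per-attempt temporary name
    private final Path finalPath;  // name the job expects for this output
    private final BufferedWriter out;
    private boolean committed = false;

    TmpThenRenameWriter(Path dir, String tmpName, String finalName) throws IOException {
        this.tmpPath = dir.resolve(tmpName);
        this.finalPath = dir.resolve(finalName);
        this.out = Files.newBufferedWriter(tmpPath);
    }

    void write(String key, String value) throws IOException {
        out.write(key + "\t" + value);
        out.newLine();
    }

    /** True if this attempt's output became the final output. */
    boolean isCommitted() {
        return committed;
    }

    @Override
    public void close() throws IOException {
        out.close();
        try {
            // Without REPLACE_EXISTING the move fails if the target exists,
            // so only the first attempt to finish claims the final name.
            Files.move(tmpPath, finalPath);
            committed = true;
        } catch (FileAlreadyExistsException e) {
            // Another attempt finished first; discard this attempt's output.
            Files.deleteIfExists(tmpPath);
        }
    }
}
```

With two attempts writing to the same final name, whichever calls close first keeps its output, and the later one deletes its tmp file.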

This means the current output formats don't have to be changed.

This patch would ideally be complemented by better tasktracker selection. I've seen instances where there are two final reduce tips and a speculative reduce is then assigned to a node that is already running the other task.
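
One possible shape for such a selection rule, as a sketch only: skip any tasktracker that already runs an attempt of the same task. The class SpeculativePlacement, the method pickTracker and the use of plain strings for tracker names are all assumptions made for illustration; none of this is in the patch or in Hadoop's scheduler.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/** Hypothetical helper; not part of the patch or of Hadoop. */
final class SpeculativePlacement {
    private SpeculativePlacement() {}

    /**
     * Returns the first candidate tracker that is not already running an
     * attempt of the task, or empty if every candidate is excluded.
     */
    static Optional<String> pickTracker(List<String> candidates,
                                        Set<String> trackersRunningTask) {
        return candidates.stream()
                .filter(t -> !trackersRunningTask.contains(t))
                .findFirst();
    }
}
```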

A speculative reduce will be started once finishedReduces / numReduceTasks >= 0.7, i.e. once at least 70% of the reduces have finished.
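
The threshold check can be sketched as below. The variable names follow the message; the class SpeculativeReducePolicy and the method shouldSpeculate are hypothetical, and the guard for a job with zero reduces is an added assumption.

```java
/** Hypothetical helper illustrating the 0.7 threshold; not the patch itself. */
final class SpeculativeReducePolicy {
    static final double THRESHOLD = 0.7;

    private SpeculativeReducePolicy() {}

    static boolean shouldSpeculate(int finishedReduces, int numReduceTasks) {
        if (numReduceTasks <= 0) {
            return false; // assumption: nothing to speculate on without reduces
        }
        // Cast before dividing: integer division would truncate the ratio
        // to 0 until every reduce had finished.
        return (double) finishedReduces / numReduceTasks >= THRESHOLD;
    }
}
```

With 10 reduce tasks, speculation would begin once 7 have finished; with 6 finished it would not.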

That's about it. Looking forward to hearing your input.

> Implement speculative re-execution of reduces
> ---------------------------------------------
>
>          Key: HADOOP-76
>          URL: http://issues.apache.org/jira/browse/HADOOP-76
>      Project: Hadoop
>         Type: Improvement
>   Components: mapred
>     Versions: 0.1.0
>     Reporter: Doug Cutting
>     Assignee: Owen O'Malley
>     Priority: Minor
>      Fix For: 0.5.0
>  Attachments: spec_reducev.patch
>
> As a first step, reduce task outputs should go to temporary files which are renamed when the task completes.

-- 
This message is automatically generated by JIRA.