Message-ID: <28656095.1151681610629.JavaMail.jira@brutus>
Date: Fri, 30 Jun 2006 15:33:30 +0000 (GMT+00:00)
From: "Johan Oskarson (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Updated: (HADOOP-76) Implement speculative re-execution of reduces
In-Reply-To: <1176204687.1142031469402.JavaMail.jira@ajax>

     [ http://issues.apache.org/jira/browse/HADOOP-76?page=all ]

Johan Oskarson updated HADOOP-76:
---------------------------------

    Attachment: spec_reducev.patch

I've tried to implement
speculative reduces and it seems to be working. However, I'd like you to take a look at it, since I'm not familiar with some of the inner workings of Hadoop.

As suggested, each reduce writes its output to a temporary name, and the first one to finish moves it to the correct output name.

The patch adds a String tmpName to getRecordWriter in OutputFormatBase, plus a close method. OutputFormatBase keeps track of the tmpName and the final name; once close is called, it moves the tmp file to the final name. This means the current output formats don't have to be changed.

This patch would ideally be complemented by better tasktracker selection. I've seen instances where there are two final reduce tips and a speculative reduce is then assigned to the same node that is already running the other task.

A speculative reduce will be started if finishedReduces / numReduceTasks >= 0.7.

That's about it. Looking forward to hearing your input.

> Implement speculative re-execution of reduces
> ---------------------------------------------
>
>          Key: HADOOP-76
>          URL: http://issues.apache.org/jira/browse/HADOOP-76
>      Project: Hadoop
>         Type: Improvement
>   Components: mapred
>     Versions: 0.1.0
>     Reporter: Doug Cutting
>     Assignee: Owen O'Malley
>     Priority: Minor
>      Fix For: 0.5.0
>  Attachments: spec_reducev.patch
>
> As a first step, reduce task outputs should go to temporary files which are renamed when the task completes.

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
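
For illustration, the temporary-name scheme and the 0.7 threshold described in this message could be sketched roughly as follows. This is not code from spec_reducev.patch. It uses java.nio.file on a local filesystem as a stand-in for Hadoop's file system, and the names SpeculativeReduceSketch, shouldSpeculate and writeAndCommit, as well as the file names in main, are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch only; none of these names come from the patch.
final class SpeculativeReduceSketch {

    // Threshold quoted in the message: speculate once 70% of reduces are done.
    static final double SPECULATIVE_THRESHOLD = 0.7;

    // Mirrors the condition finishedReduces / numReduceTasks >= 0.7.
    static boolean shouldSpeculate(int finishedReduces, int numReduceTasks) {
        if (numReduceTasks <= 0) {
            return false; // nothing to speculate on
        }
        return (double) finishedReduces / numReduceTasks >= SPECULATIVE_THRESHOLD;
    }

    // Writes the output under tmpName, then tries to move it to finalName.
    // Returns true if this attempt's output became the final output.
    static boolean writeAndCommit(Path tmpName, Path finalName, String data)
            throws IOException {
        Files.write(tmpName, data.getBytes(StandardCharsets.UTF_8));
        try {
            // Without REPLACE_EXISTING the move fails if finalName already exists,
            // i.e. if another attempt finished first.
            Files.move(tmpName, finalName);
            return true;
        } catch (FileAlreadyExistsException e) {
            // Another attempt already committed; discard this attempt's output.
            Files.delete(tmpName);
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("spec-reduce");
        Path finalName = dir.resolve("part-00000"); // illustrative output name

        // Two attempts at the same reduce, each writing to its own temporary name.
        boolean first = writeAndCommit(dir.resolve("tmp-attempt-0"), finalName, "key\t1\n");
        boolean second = writeAndCommit(dir.resolve("tmp-attempt-1"), finalName, "key\t1\n");

        System.out.println("first committed: " + first + ", second committed: " + second);
        System.out.println("speculate at 7 of 10 done: " + shouldSpeculate(7, 10));
    }
}
```

The sketch relies on the move failing when the final name already exists, so whichever attempt finishes second simply discards its temporary file. Whether the rename is atomic depends on the underlying file system, which this sketch does not address.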