X-Original-To: apmail-hadoop-mapreduce-issues-archive@minotaur.apache.org
Delivered-To: apmail-hadoop-mapreduce-issues-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by minotaur.apache.org (Postfix) with SMTP id 60277B9AB;
	Mon, 16 Jan 2012 09:14:17 +0000 (UTC)
Received: (qmail 33947 invoked by uid 500); 16 Jan 2012 09:14:15 -0000
Delivered-To: apmail-hadoop-mapreduce-issues-archive@hadoop.apache.org
Received: (qmail 33748 invoked by uid 500); 16 Jan 2012 09:14:07 -0000
Mailing-List: contact mapreduce-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: mapreduce-issues@hadoop.apache.org
Delivered-To: mailing list mapreduce-issues@hadoop.apache.org
Received: (qmail 33707 invoked by uid 99); 16 Jan 2012 09:14:01 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136)
	by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 16 Jan 2012 09:14:01 +0000
X-ASF-Spam-Status: No, hits=-2000.0 required=5.0
	tests=ALL_TRUSTED,T_RP_MATCHES_RCVD
X-Spam-Check-By: apache.org
Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116)
	by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 16 Jan 2012 09:14:00 +0000
Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116])
	by hel.zones.apache.org (Postfix) with ESMTP id 6040614F906;
	Mon, 16 Jan 2012 09:13:40 +0000 (UTC)
Date: Mon, 16 Jan 2012 09:13:40 +0000 (UTC)
From: "Harsh J (Resolved) (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Message-ID: <1984098717.44041.1326705220411.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Resolved] (MAPREDUCE-460) Should be able to re-run jobs, collecting only missing output
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


     [ https://issues.apache.org/jira/browse/MAPREDUCE-460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J resolved MAPREDUCE-460.
-------------------------------

    Resolution: Not A Problem

This has gone stale, closing out. We can discuss how best to solve this in a new ticket now that MR2 is out.

> Should be able to re-run jobs, collecting only missing output
> -------------------------------------------------------------
>
>                 Key: MAPREDUCE-460
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-460
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Bryan Pendleton
>            Assignee: Owen O'Malley
>            Priority: Minor
>
> For jobs with no side effects (roughly == jobs with speculative execution enabled), if partial output has been generated, it should be possible to re-run the job, and fill in the missing pieces. I have now run the same job twice, once finishing 42 of 44 reduce tasks, another time finishing only 17. Each time, many nodes have failed, causing many many tasks to fail (in one case, 5k failures from 15k map tasks, 23 failures from 44 reduces), but some valid output was generated. Since the output is only dependent on the input, and both jobs used the same input, I will now be able to combine these two failed task outputs to get a completed job's output. This should be something that can be more automatic.
> In particular, it should be possible to resubmit a job, with a list of partitions that should be ignored. A special Combiner, or pre-Combiner, would throw out any map output for partitions that have already been successfully completed, thus reducing the amount of data that needs to be reduced to complete the job. It would, of course, be nice to support "filling in" existing outputs, rather than having to do a move operation on completed outputs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
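
The "list of partitions that should be ignored" idea from the issue description can be illustrated with a rough sketch. This is not Hadoop code and not a design anyone proposed on the ticket: the function names are hypothetical, `zlib.crc32` is only a stand-in for whatever deterministic partitioner the job uses, and the sets of finished partitions are made up to match the 42-of-44 and 17-of-44 counts mentioned above.

```python
import zlib

NUM_REDUCES = 44  # reduce task count taken from the issue description


def partition_for(key, num_reduces=NUM_REDUCES):
    """Stand-in partitioner (assumption): deterministic hash modulo reduce count."""
    return zlib.crc32(key.encode("utf-8")) % num_reduces


def missing_partitions(completed_runs, num_reduces=NUM_REDUCES):
    """Return the partitions that none of the earlier runs finished."""
    done = set().union(*completed_runs)
    return set(range(num_reduces)) - done


def filter_map_output(records, skip, num_reduces=NUM_REDUCES):
    """Hypothetical pre-Combiner: drop pairs bound for already-completed partitions."""
    for key, value in records:
        if partition_for(key, num_reduces) not in skip:
            yield key, value


# Made-up example: run A finished partitions 0..41, run B finished 27..43.
run_a = set(range(42))
run_b = set(range(27, 44))

still_missing = missing_partitions([run_a, run_b])  # empty: the two runs cover everything
after_a_only = missing_partitions([run_a])          # partitions 42 and 43 remain

# On a re-run after run A alone, only records for partitions 42 and 43 survive the filter.
skip = set(range(NUM_REDUCES)) - after_a_only
records = [("key-%d" % i, i) for i in range(1000)]
kept = list(filter_map_output(records, skip))
```

The filter only helps if the partitioner is deterministic and the reduce count is unchanged between runs; otherwise the partition numbers of the earlier outputs would not line up with those of the re-run.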