Date: Fri, 2 Aug 2013 00:09:49 +0000 (UTC)
From: "Jason Lowe (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Updated] (MAPREDUCE-5251) Reducer should not implicate map attempt if it has insufficient space to fetch map output

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Lowe updated MAPREDUCE-5251:
----------------------------------
    Fix Version/s:     (was: 2.3.0)
                   2.1.1-beta

I pulled this into branch-2.1-beta as well.

> Reducer should not implicate map attempt if it has insufficient space to fetch map output
> -----------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5251
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5251
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.7, 2.0.4-alpha
>            Reporter: Jason Lowe
>            Assignee: Ashwin Shankar
>             Fix For: 3.0.0, 0.23.10, 2.1.1-beta
>
>         Attachments: MAPREDUCE-5251-2.txt, MAPREDUCE-5251-3.txt, MAPREDUCE-5251-4.txt, MAPREDUCE-5251-5.txt, MAPREDUCE-5251-6.txt, MAPREDUCE-5251-7-b23.txt, MAPREDUCE-5251-7.txt
>
>
> A job can fail if a reducer happens to run on a node with insufficient space to hold a map attempt's output. The reducer keeps reporting the map attempt as bad, and if the map attempt is re-launched too many times before the reducer concludes that it may be the real problem, the job can fail.
> In that scenario it would be better to re-launch the reduce attempt, in the hope that it runs on another node with sufficient space to complete the shuffle. Reporting the map attempt as bad and relaunching the map task doesn't change the fact that the reducer can't hold the output.
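
For readers of the archive, the behavior requested in the description can be sketched roughly as follows. This is only an illustration of the decision being asked for, not the code in the attached patches: the class `ShuffleFailurePolicy`, the `Action` enum, and the `decide` method are hypothetical names, and comparing the map output size against free local space is an assumed stand-in for however the reducer actually detects that it cannot hold the output.

```java
// Hypothetical sketch; these names are illustrative and are not Hadoop's actual classes.
final class ShuffleFailurePolicy {

    /** What the reduce attempt should do after trying to fetch one map output. */
    enum Action { CONTINUE, REPORT_MAP_ATTEMPT, FAIL_REDUCE_ATTEMPT }

    private ShuffleFailurePolicy() {}

    /**
     * Decides who is blamed for a fetch result.
     *
     * @param fetchSucceeded whether the map output was copied successfully
     * @param mapOutputBytes size of the map output the reducer tried to fetch
     * @param localFreeBytes space the reducer's node has available (assumed to be known)
     */
    static Action decide(boolean fetchSucceeded, long mapOutputBytes, long localFreeBytes) {
        if (fetchSucceeded) {
            return Action.CONTINUE;
        }
        if (mapOutputBytes > localFreeBytes) {
            // The reducer cannot hold the output: re-running the map will not help,
            // so fail this reduce attempt and let it be re-launched elsewhere.
            return Action.FAIL_REDUCE_ATTEMPT;
        }
        // Otherwise the failure may really be on the map side.
        return Action.REPORT_MAP_ATTEMPT;
    }
}
```

The point of the sketch is that a local space shortage is checked before any blame is assigned to the map attempt, so repeated fetch failures on a full node no longer count against the map task.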