Date: Tue, 12 Mar 2013 19:51:14 +0000 (UTC)
From: "Robert Joseph Evans (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Updated] (MAPREDUCE-5060) Fetch failures that time out only count against the first map task


     [ https://issues.apache.org/jira/browse/MAPREDUCE-5060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-5060:
-------------------------------------------

    Status: Patch Available  (was: Open)

> Fetch failures that time out only count against the first map task
> ------------------------------------------------------------------
>
>                 Key: MAPREDUCE-5060
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5060
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>            Priority: Critical
>         Attachments: MR-5060.txt
>
>
> When a fetch failure happens after the socket has already "connected", it is only counted against the first map task. But most of the time the failure is caused by an issue with the node itself, not with the individual map task, so all failures that occur while trying to initiate the connection should count against all of the tasks.
> This caused a particularly unfortunate job to take an hour and a half longer than it needed to.
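The accounting change the report asks for could be sketched roughly as below. This is a minimal illustration only, not the code in MR-5060.txt and not Hadoop's actual fetcher or scheduler: the class `FetchFailureSketch`, its methods, and the `failedInitiating` flag are all hypothetical names invented for this example. It assumes the caller can tell whether a failure happened while the connection was still being initiated (which, per the report, includes a timeout after the socket reports "connected") or later, while reading one specific map's output.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not the Hadoop fetcher or its scheduler. */
final class FetchFailureSketch {

    /** Failure count per map attempt ID (hypothetical bookkeeping). */
    private final Map<String, Integer> failures = new HashMap<>();

    /**
     * Records one failed fetch from a single host.
     *
     * @param mapsOnHost       map attempts whose output was requested from the host
     * @param failedInitiating true if the failure happened while initiating the
     *                         connection; false if it happened later, while
     *                         reading the first map's output
     * @return the map attempts that were charged with this failure
     */
    List<String> recordFailure(List<String> mapsOnHost, boolean failedInitiating) {
        List<String> charged = new ArrayList<>();
        if (mapsOnHost.isEmpty()) {
            return charged;
        }
        if (failedInitiating) {
            // Likely a node-level problem: charge every map on that host.
            charged.addAll(mapsOnHost);
        } else {
            // Likely a problem with one map's output: charge only that map.
            charged.add(mapsOnHost.get(0));
        }
        for (String mapId : charged) {
            failures.merge(mapId, 1, Integer::sum);
        }
        return charged;
    }

    /** Returns how many failures have been charged to the given map attempt. */
    int failureCount(String mapId) {
        return failures.getOrDefault(mapId, 0);
    }
}
```

With three maps on one host, a failure flagged as `failedInitiating` charges all three, so a bad node accumulates failures for every map it hosts instead of only for the first one in the request.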