Date: Fri, 30 Mar 2012 19:30:28 +0000 (UTC)
From: "Robert Joseph Evans (Updated) (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Reply-To: mapreduce-issues@hadoop.apache.org
Message-ID: <1731018163.39493.1333135828360.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <464246621.39118.1333130906702.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Updated] (MAPREDUCE-4089) Hung Tasks never time out.

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans updated MAPREDUCE-4089:
-------------------------------------------

    Attachment: MR-4089.txt

This patch addresses the timeout issue by making ping not update progress. It is still not completely compatible with 1.0: in 1.0, if the timeout is set to 0 the task will never time out. But because this patch makes it so that ping is ignored, a task that has a timeout of 0 but is so locked up that it cannot ping anymore will never time out. I am planning to address these in a follow-on JIRA, unless someone has objections to doing so.

I also have not run all of the unit tests yet.

> Hung Tasks never time out.
> ---------------------------
>
>                 Key: MAPREDUCE-4089
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4089
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 0.23.2, 2.0.0, trunk
>            Reporter: Robert Joseph Evans
>            Assignee: Robert Joseph Evans
>            Priority: Blocker
>         Attachments: MR-4089.txt
>
>
> The AM will time out a task through mapreduce.task.timeout only when it does not hear from the task within the given timeframe. On 1.0 a task must be making progress, either by reading input from HDFS, writing output to HDFS, writing to a log, or calling a special method to inform it that it is still making progress.
> This is because on 0.23 a status update, which happens every 3 seconds, is counted as progress.
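
For readers following the thread, the behaviour described above (a ping shows the task is alive but does not reset the timeout clock, while real progress does) can be illustrated with a small standalone sketch. This is not the code in MR-4089.txt: the class TaskTimeoutSketch and its methods are hypothetical names invented for illustration, and treating a timeout of 0 as "never time out" is an assumption taken from the 1.0 behaviour mentioned in the comment.

```java
// Hypothetical illustration only; not taken from the MR-4089.txt patch.
class TaskTimeoutSketch {
    private final long timeoutMs;   // stands in for mapreduce.task.timeout
    private long lastProgressMs;    // last time real progress was reported
    private long lastPingMs;        // last time any heartbeat was heard

    public TaskTimeoutSketch(long timeoutMs, long startMs) {
        this.timeoutMs = timeoutMs;
        this.lastProgressMs = startMs;
        this.lastPingMs = startMs;
    }

    // A ping records liveness but deliberately leaves lastProgressMs alone.
    public void ping(long nowMs) {
        lastPingMs = nowMs;
    }

    // Real progress (I/O, log output, explicit progress call) resets the clock.
    public void progress(long nowMs) {
        lastProgressMs = nowMs;
        lastPingMs = nowMs;
    }

    public long getLastPingMs() {
        return lastPingMs;
    }

    // Assumption: a timeout of 0 (or less) disables the check, as on 1.0.
    public boolean isTimedOut(long nowMs) {
        if (timeoutMs <= 0) {
            return false;
        }
        return nowMs - lastProgressMs > timeoutMs;
    }

    public static void main(String[] args) {
        TaskTimeoutSketch task = new TaskTimeoutSketch(1000L, 0L);
        task.ping(900L);                            // heartbeat only
        System.out.println(task.isTimedOut(1500L)); // no progress since 0
        task.progress(1400L);                       // real progress
        System.out.println(task.isTimedOut(2000L)); // clock was reset at 1400
    }
}
```

Note that in this sketch lastPingMs is recorded but never consulted by isTimedOut, which mirrors the gap raised in the comment: with a timeout of 0, a task that stops pinging entirely is still never timed out.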