Date: Tue, 16 Aug 2011 18:15:31 +0000 (UTC)
From: "Allen Wittenauer (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Reply-To: mapreduce-issues@hadoop.apache.org
Message-ID: <1892717201.42144.1313518531405.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <2072931558.41877.1313512780128.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (MAPREDUCE-2846) approx 10% of all tasks fail with DefaultTaskController


    [ https://issues.apache.org/jira/browse/MAPREDUCE-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085880#comment-13085880 ]

Allen Wittenauer commented on MAPREDUCE-2846:
---------------------------------------------

*nods* I'm mostly convinced it is a race condition in MR-2415. I haven't had enough time to start digging into the source to track it down further. I did talk to Owen about it already, but thought it would be useful to at least get the JIRA filed to put more eyes on it, since race conditions are usually pretty awful to track down.

> approx 10% of all tasks fail with DefaultTaskController
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-2846
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2846
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: task, task-controller, tasktracker
>    Affects Versions: 0.20.204.0
>            Reporter: Allen Wittenauer
>            Priority: Blocker
>
> After upgrading our test 0.20.203 grid to 0.20.204-rc2, we ran terasort to verify operation. While the job completed successfully, approx 10% of the tasks failed with task runner execution errors and the inability to create symlinks for attempt logs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira