Message-ID: <632673652.1256330519688.JavaMail.jira@brutus>
Date: Fri, 23 Oct 2009 20:41:59 +0000 (UTC)
From: "Aaron Kimball (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Reply-To: mapreduce-issues@hadoop.apache.org
Subject: [jira] Updated: (MAPREDUCE-1119) When tasks fail to report status, show tasks's stack dump before killing
In-Reply-To: <44290688.1255733731270.JavaMail.jira@brutus>

[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron Kimball updated MAPREDUCE-1119:
-------------------------------------

    Status: Patch Available  (was: Open)

> When tasks fail to report status, show tasks's stack dump before killing
> ------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1119
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1119
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-1119.2.patch, MAPREDUCE-1119.patch
>
>
> When the TT kills tasks that haven't reported status, it should somehow gather a stack dump for the task. This could be done either by sending a SIGQUIT (so the dump ends up in stdout) or perhaps something like JDI to gather the stack directly from Java. This may be somewhat tricky since the child may be running as another user (so the SIGQUIT would have to go through LinuxTaskController). This feature would make debugging these kinds of failures much easier, especially if we could somehow get it into the TaskDiagnostic message

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
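
For readers unfamiliar with the two mechanisms the description mentions, here is a minimal sketch in plain Java. It is not taken from the attached patches and does not use any Hadoop API. The class name StackDumpSketch and both method names are hypothetical. The first method only builds the command line that would deliver SIGQUIT to a child JVM, which makes a HotSpot-style JVM print its thread dump to stdout; actually running it, and doing so as another user through LinuxTaskController, is left out. The second method shows the in-process analogue using the standard Thread.getAllStackTraces() call, as a rough illustration of what the dump contains.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the MAPREDUCE-1119 patch.
class StackDumpSketch {

    // Builds the command that would send SIGQUIT to the process with the given pid.
    // On a typical Unix JVM, SIGQUIT triggers a thread dump on the child's stdout
    // without terminating it, so a kill can follow afterwards.
    static List<String> sigquitCommand(long pid) {
        return Arrays.asList("kill", "-QUIT", Long.toString(pid));
    }

    // In-process analogue: formats the stack of every live thread in this JVM.
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=")
              .append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 12345 is a placeholder pid, not a real task process.
        System.out.println(String.join(" ", sigquitCommand(12345L)));
        System.out.print(dumpAllThreads());
    }
}
```

A tracker following the description would send the signal first, give the child a short grace period to write the dump, and only then kill it. The length of that grace period and how the output reaches the diagnostic message are design questions the issue leaves open.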