Date: Thu, 29 Oct 2015 16:15:27 +0000 (UTC)
From: "Steve Loughran (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-4309) Add debug information to application logs when a container fails

    [ https://issues.apache.org/jira/browse/YARN-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14980704#comment-14980704 ]

Steve Loughran commented on YARN-4309:
--------------------------------------

I mean all the environment variables of the container

> Add debug information to application logs when a container fails
> ----------------------------------------------------------------
>
>                 Key: YARN-4309
>                 URL: https://issues.apache.org/jira/browse/YARN-4309
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>            Reporter: Varun Vasudev
>            Assignee: Varun Vasudev
>
> Sometimes when a container fails, it can be pretty hard to figure out why it failed.
> My proposal is that if a container fails, we collect information about the container local dir and dump it into the container log dir. Ideally, I'd like to tar up the directory entirely, but I'm not sure of the security and space implications of such an approach. At the very least, we can list all the files in the container local dir and dump the contents of launch_container.sh (into the container log dir).
> When log aggregation occurs, all this information will automatically get collected and make debugging such failures much easier.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
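
The minimum proposed in the issue (list the files in the container local dir and copy launch_container.sh into the container log dir), together with the environment dump asked for in the comment, could look roughly like the sketch below. It is illustration only, not NodeManager code: the function name and the output file names `directory.info` and `container_env.info` are hypothetical, and Python is used only for brevity.

```python
import os
import shutil
import tempfile
from pathlib import Path


def dump_container_debug_info(local_dir, log_dir, env=None):
    """Write debug files for a failed container into its log directory.

    All names here are hypothetical; this is not the NodeManager API.
    """
    local_dir, log_dir = Path(local_dir), Path(log_dir)
    log_dir.mkdir(parents=True, exist_ok=True)

    # 1. Recursive listing of the container local dir: relative path and size.
    listing = log_dir / "directory.info"
    with listing.open("w") as out:
        for root, dirs, files in os.walk(local_dir):
            dirs.sort()  # make the traversal order deterministic
            for name in sorted(files):
                path = Path(root) / name
                rel = path.relative_to(local_dir).as_posix()
                # lstat() so that a broken symlink does not raise
                out.write(f"{rel}\t{path.lstat().st_size}\n")

    # 2. Copy the launch script verbatim, if the container has one.
    script = local_dir / "launch_container.sh"
    if script.is_file():
        shutil.copyfile(script, log_dir / "launch_container.sh")

    # 3. Optionally dump the container's environment variables.
    if env is not None:
        with (log_dir / "container_env.info").open("w") as out:
            for key in sorted(env):
                out.write(f"{key}={env[key]}\n")

    return listing


if __name__ == "__main__":
    # Throwaway directories standing in for a real container's dirs.
    with tempfile.TemporaryDirectory() as tmp:
        local, logs = Path(tmp) / "local", Path(tmp) / "logs"
        local.mkdir()
        (local / "launch_container.sh").write_text("#!/bin/bash\nexit 1\n")
        print(dump_container_debug_info(local, logs, env=dict(os.environ)).read_text())
```

Because everything is written under the container log dir, log aggregation would pick these files up without further changes. Dumping the full environment may expose secrets, so the same security question raised for the tarball applies to step 3.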