Date: Tue, 27 Sep 2016 13:58:20 +0000 (UTC)
From: "Jason Lowe (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Subject: [jira] [Commented] (MAPREDUCE-6771) RMContainerAllocator sends container diagnostics event after corresponding completion event

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15526217#comment-15526217 ]

Jason Lowe commented on MAPREDUCE-6771:
---------------------------------------

Odd, the precommit came back for
patch v3 instead of v4. Kicking it again.

> RMContainerAllocator sends container diagnostics event after corresponding completion event
> -------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6771
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6771
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>   Affects Versions: 2.7.3
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: TaUnsuccessfullyEventEmission.jpg, mapreduce6771.001.patch, mapreduce6771.002.patch, mapreduce6771.003.patch, mapreduce6771.004.patch
>
>
> Task containers can exceed their resource limit and be killed by the Node Manager. The MR AM is then notified of the container status and diagnostics information through its heartbeat with the RM. However, the diagnostics information may never make it into the .jhist file, so when the job completes, the diagnostics information associated with the failed task attempts is empty. This makes it hard for users to find the root cause of job failures that are often caused by memory leaks.
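
To illustrate why the event order in the issue title matters, here is a minimal sketch. It is not the Hadoop code: ToyTaskAttempt and its methods are hypothetical stand-ins, and the sketch assumes that an attempt writes its history record at the moment it handles the completion event and ignores any diagnostics that arrive afterwards.

```java
import java.util.ArrayList;
import java.util.List;

public class DiagnosticsOrderingSketch {

    /** Hypothetical stand-in for a task attempt; not a real Hadoop class. */
    static class ToyTaskAttempt {
        private final List<String> diagnostics = new ArrayList<>();
        private boolean finished = false;
        private String historyRecord = null;

        /** Records a diagnostics message, but only while the attempt is still running. */
        void onDiagnosticsUpdate(String message) {
            if (finished) {
                return; // assumption: updates after completion are dropped
            }
            diagnostics.add(message);
        }

        /** Marks the attempt finished and snapshots the diagnostics seen so far. */
        void onContainerCompleted() {
            if (finished) {
                return;
            }
            finished = true;
            historyRecord = String.join("; ", diagnostics);
        }

        String historyRecord() {
            return historyRecord;
        }
    }

    /** Order described in the issue title: completion first, diagnostics second. */
    static String completionFirst(String message) {
        ToyTaskAttempt attempt = new ToyTaskAttempt();
        attempt.onContainerCompleted();
        attempt.onDiagnosticsUpdate(message);
        return attempt.historyRecord();
    }

    /** Reversed order: diagnostics first, completion second. */
    static String diagnosticsFirst(String message) {
        ToyTaskAttempt attempt = new ToyTaskAttempt();
        attempt.onDiagnosticsUpdate(message);
        attempt.onContainerCompleted();
        return attempt.historyRecord();
    }

    public static void main(String[] args) {
        String message = "Container killed: memory limit exceeded";
        System.out.println("completion first:  [" + completionFirst(message) + "]");
        System.out.println("diagnostics first: [" + diagnosticsFirst(message) + "]");
    }
}
```

Under that assumption, the completion-first order leaves the history record empty, which matches the symptom in the description, while the diagnostics-first order keeps the message.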