hadoop-mapreduce-issues mailing list archives

From "Jason Lowe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-6771) Diagnostics information can be lost in .jhist if task containers are killed by Node Manager.
Date Mon, 29 Aug 2016 15:02:20 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-6771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15446117#comment-15446117 ]

Jason Lowe commented on MAPREDUCE-6771:
---------------------------------------

bq. Not sure how a unit test can be written. Any suggestion is greatly appreciated.

A unit test could verify that when the RMCommunicator receives a container completion event
with diagnostics, it sends the diagnostic event _before_ it sends the completion event.  That
test will fail before this change and pass afterwards.
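
As a rough illustration of the ordering check described above, here is a minimal, self-contained sketch. It does not use the real Hadoop classes: `DiagnosticsOrderingSketch`, `EventKind`, `RecordingHandler`, `onContainerCompleted` and `diagnosticsBeforeCompletion` are all hypothetical stand-ins. An actual test would drive the AM's allocator with a mocked event handler and assert on the order of the events it captures.

```java
import java.util.ArrayList;
import java.util.List;

public class DiagnosticsOrderingSketch {

    /** Hypothetical event kinds; stand-ins for the AM's real task attempt events. */
    enum EventKind { DIAGNOSTICS_UPDATE, CONTAINER_COMPLETED }

    /** Hypothetical handler that records every event it receives, in arrival order. */
    static final class RecordingHandler {
        final List<EventKind> received = new ArrayList<>();

        void handle(EventKind kind) {
            received.add(kind);
        }
    }

    /**
     * Stand-in for the code under test (assumed behavior after the fix):
     * forward the diagnostics first, if there are any, then the completion.
     */
    static void onContainerCompleted(String diagnostics, RecordingHandler handler) {
        if (diagnostics != null && !diagnostics.isEmpty()) {
            handler.handle(EventKind.DIAGNOSTICS_UPDATE);
        }
        handler.handle(EventKind.CONTAINER_COMPLETED);
    }

    /** The assertion the unit test would make: diagnostics arrive strictly before completion. */
    static boolean diagnosticsBeforeCompletion(List<EventKind> events) {
        int diag = events.indexOf(EventKind.DIAGNOSTICS_UPDATE);
        int done = events.indexOf(EventKind.CONTAINER_COMPLETED);
        return diag >= 0 && done >= 0 && diag < done;
    }

    public static void main(String[] args) {
        RecordingHandler handler = new RecordingHandler();
        onContainerCompleted("Container killed by the Node Manager", handler);
        if (!diagnosticsBeforeCompletion(handler.received)) {
            throw new AssertionError("diagnostics must precede completion: " + handler.received);
        }
        System.out.println(handler.received);
    }
}
```

If the two `handle` calls in `onContainerCompleted` were swapped, the check in `main` would throw, which is the "fails before, passes after" property the test is meant to have.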

> Diagnostics information can be lost in .jhist if task containers are killed by Node Manager.
> --------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6771
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6771
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv2
>    Affects Versions: 2.7.3
>            Reporter: Haibo Chen
>            Assignee: Haibo Chen
>         Attachments: mapreduce6771.001.patch
>
>
> Task containers can exceed their resource limit and be killed by the Node Manager. The MR
AM is then notified of the container status and diagnostics information through its heartbeat
with the RM.  However, the diagnostics information may never make it into the .jhist
file, so when the job completes, the diagnostics information associated with the failed task
attempts is empty.  This makes it hard for users to find the root cause of job failures, which
are often caused by memory leaks.




