hadoop-mapreduce-issues mailing list archives

From "Eric Payne (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MAPREDUCE-5044) Have AM trigger jstack on task attempts that timeout before killing them
Date Sat, 20 Feb 2016 21:39:18 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated MAPREDUCE-5044:
    Attachment: MAPREDUCE-5044.v07.local.patch

Thanks, [~jira.shegalov], for all of the work already done on this JIRA.

I have upmerged the latest patch and integrated it with the {{SignalContainerRequest}} that
was added as part of YARN-445 and its children.

[~mingma], [~xgong], [~jlowe], [~jira.shegalov], would you please take a look?

I would like to see the functionality in this JIRA implemented. We occasionally see containers
time out, and it would be good if users could get direct feedback in the form of a jstack
to help them debug their applications.

IIUC, YARN-445 and its children put in place the infrastructure for a {{Client -> RM ->
NM -> Container}} signal path. However, in order to automatically dump the jstack when
a container times out, we still need an {{AM -> NM -> Container}} signal path. This
JIRA (MAPREDUCE-5044 along with YARN-1515) adds this signal path along with the ability to
send multiple signals per call.

I think sending multiple signals per call could be split into a separate JIRA.
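
To make the intended ordering concrete, here is a minimal, self-contained Java sketch of the AM-side flow described above: when a task attempt times out, the AM asks the NM to deliver a thread-dump signal before the kill signal, both in one call. This is only a model under stated assumptions. {{TimeoutSignalSketch}}, {{NodeManagerStub}}, {{RecordingNodeManager}}, {{Signal}} and {{onTaskAttemptTimeout}} are hypothetical names invented for illustration; they are not classes from the patch, from YARN-445, or from YARN-1515.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Illustrative model only. None of these types are Hadoop or YARN classes;
 * every name here is a hypothetical stand-in.
 */
public class TimeoutSignalSketch {

    /** Hypothetical signal names. On a HotSpot JVM, SIGQUIT prints a thread dump to standard output. */
    enum Signal { QUIT, KILL }

    /** Hypothetical stand-in for the NM endpoint that delivers signals to a container. */
    interface NodeManagerStub {
        void signalContainer(String containerId, List<Signal> signals);
    }

    /** Records each delivered signal so the ordering can be inspected. */
    static final class RecordingNodeManager implements NodeManagerStub {
        final List<String> delivered = new ArrayList<>();

        @Override
        public void signalContainer(String containerId, List<Signal> signals) {
            for (Signal s : signals) {
                delivered.add(containerId + ":" + s);
            }
        }
    }

    /** AM-side handling of an expired attempt: request the dump first, then the kill, in one call. */
    static void onTaskAttemptTimeout(NodeManagerStub nm, String containerId) {
        nm.signalContainer(containerId, List.of(Signal.QUIT, Signal.KILL));
    }

    public static void main(String[] args) {
        RecordingNodeManager nm = new RecordingNodeManager();
        onTaskAttemptTimeout(nm, "container_01");
        System.out.println(nm.delivered); // prints [container_01:QUIT, container_01:KILL]
    }
}
```

A real implementation would presumably also need a delay between the two signals so the JVM has time to write the dump before it is killed; the sketch leaves that out.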

> Have AM trigger jstack on task attempts that timeout before killing them
> ------------------------------------------------------------------------
>                 Key: MAPREDUCE-5044
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5044
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: mr-am
>    Affects Versions: 2.1.0-beta
>            Reporter: Jason Lowe
>            Assignee: Gera Shegalov
>         Attachments: MAPREDUCE-5044.v01.patch, MAPREDUCE-5044.v02.patch, MAPREDUCE-5044.v03.patch,
>                      MAPREDUCE-5044.v04.patch, MAPREDUCE-5044.v05.patch, MAPREDUCE-5044.v06.patch,
>                      MAPREDUCE-5044.v07.local.patch, Screen Shot 2013-11-12 at 1.05.32 PM.png,
>                      Screen Shot 2013-11-12 at 1.06.04 PM.png
>
> When an AM expires a task attempt it would be nice if it triggered a jstack output via
> SIGQUIT before killing the task attempt.  This would be invaluable for helping users debug
> their hung tasks, especially if they do not have shell access to the nodes.

This message was sent by Atlassian JIRA
