hadoop-mapreduce-issues mailing list archives

From "Sandy Ryza (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3688) Need better Error message if AM is killed/throws exception
Date Tue, 19 Mar 2013 22:27:17 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606941#comment-13606941 ]

Sandy Ryza commented on MAPREDUCE-3688:

Hi Ravi,

I haven't come across the issue that you mentioned, i.e. I've gotten the proper diagnostic
message when the NM kills a container for going over resource limits, but my testing has
been limited.  Sounds like some sort of bug in the NM state machine?

The part I've been looking into is related to Koji's work: making it so that any errors that
containers write to stdout/stderr on startup get added to the diagnostics.

As the focus of this JIRA has shifted among a few related but separate issues, I think
at this point it makes the most sense to file new JIRAs (or subtasks?) for the specific changes
we want to make.  Would it work for you if I pick up the logs and you work on the over-resource-limits
message?

> Need better Error message if AM is killed/throws exception
> ----------------------------------------------------------
>                 Key: MAPREDUCE-3688
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3688
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am, mrv2
>    Affects Versions: 0.23.1
>            Reporter: David Capwell
>            Assignee: Sandy Ryza
>             Fix For: 0.23.2
>         Attachments: mapreduce-3688-h0.23-v01.patch, mapreduce-3688-h0.23-v02.patch
> We need better error messages in the UI if the AM gets killed or throws an Exception.
> If the following error gets thrown: 
> java.lang.NumberFormatException: For input string: "9223372036854775807l" // last char is an L
> then the UI should show this exception.  Instead I get the following:
> Application application_1326504761991_0018 failed 1 times due to AM Container for appattempt_1326504761991_0018_000001
> exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException
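
For reference, the exception quoted in the description can be reproduced outside Hadoop, and the idea of surfacing container stderr in the diagnostics can be sketched in a few lines. This is only a rough illustration: it assumes the value was parsed with Long.parseLong (the report does not say which call threw), and the class and method names (AmDiagnosticsSketch, describeParseFailure, firstExceptionLine) are hypothetical and are not part of the attached patches or of any Hadoop API.

```java
public class AmDiagnosticsSketch {

    // Assumption: the AM parsed the value with Long.parseLong. A trailing
    // 'l' is valid in a Java long literal but is not accepted by parseLong.
    static String describeParseFailure(String input) {
        try {
            Long.parseLong(input);
            return null; // parsed cleanly, nothing to report
        } catch (NumberFormatException e) {
            return e.getClass().getName() + ": " + e.getMessage();
        }
    }

    // Hypothetical helper: pick the first stderr line that names an
    // exception or error, so it could be appended to the diagnostics.
    static String firstExceptionLine(String stderr) {
        for (String line : stderr.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.contains("Exception") || trimmed.contains("Error")) {
                return trimmed;
            }
        }
        return null; // nothing that looks like an exception
    }

    public static void main(String[] args) {
        // prints: java.lang.NumberFormatException: For input string: "9223372036854775807l"
        System.out.println(describeParseFailure("9223372036854775807l"));

        // Stand-in for what a container might write to stderr on startup.
        String stderr = "log4j: starting up\n"
                + "java.lang.NumberFormatException: For input string: \"9223372036854775807l\"\n"
                + "\tat java.lang.Long.parseLong(Long.java)\n";
        String diagnostics = "exited with exitCode: 1 due to: " + firstExceptionLine(stderr);
        System.out.println(diagnostics);
    }
}
```

Matching on the words "Exception" and "Error" is a deliberately crude heuristic; a real change would need to decide how much of stdout/stderr to carry into the diagnostics.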

