hadoop-mapreduce-issues mailing list archives

From "Ravi Prakash (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-3688) Need better Error message if AM is killed/throws exception
Date Tue, 19 Mar 2013 22:03:16 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13606901#comment-13606901 ]

Ravi Prakash commented on MAPREDUCE-3688:

Hi Sandy! Have you had a chance to look at this? Otherwise I'll be happy to take this back and work on
it this week.

From my testing on trunk, I notice that even when the AM goes over its container
limits (which I trigger with -Dyarn.app.mapreduce.am.resource.mb=512 -Dyarn.app.mapreduce.am.command-opts="-Xmx3500m"
on a sleep job), the error is sometimes propagated back and sometimes not. Can you please
corroborate this? When State == FinalState == FAILED, the error is propagated back. However,
about half the time State == FINISHED and FinalState == KILLED, in which case there is no
message anywhere to help me: nothing in the diagnostics, and there are no logs.
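
To make the two outcomes above concrete, here is a minimal sketch of how a client could at least flag the silent case. Everything in it is hypothetical: the State and FinalState enums are stand-ins for the values named above, not the YARN API types, and the fallback wording and the placeholder diagnostics string are invented for illustration.

```java
class AmOutcomeDemo {
    // Hypothetical stand-ins for the values observed above; NOT the YARN API types.
    enum State { FINISHED, FAILED, KILLED }
    enum FinalState { SUCCEEDED, FAILED, KILLED }

    // Builds a user-facing message, falling back to an explicit hint
    // when an unsuccessful attempt carries no diagnostics at all.
    static String describe(State state, FinalState finalState, String diagnostics) {
        String header = "State=" + state + ", FinalState=" + finalState;
        if (diagnostics != null && !diagnostics.trim().isEmpty()) {
            return header + ": " + diagnostics;
        }
        if (finalState != FinalState.SUCCEEDED) {
            // Assumed wording; the point is only that the user is told something.
            return header + ": no diagnostics were reported for this unsuccessful attempt";
        }
        return header;
    }

    public static void main(String[] args) {
        // Case 1: the error is propagated back (placeholder diagnostics text).
        System.out.println(describe(State.FAILED, FinalState.FAILED, "<diagnostics text>"));
        // Case 2: the silent case, with empty diagnostics.
        System.out.println(describe(State.FINISHED, FinalState.KILLED, ""));
    }
}
```

This only papers over the symptom on the client side; it does not explain why the diagnostics are missing in the second case.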

> Need better Error message if AM is killed/throws exception
> ----------------------------------------------------------
>                 Key: MAPREDUCE-3688
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3688
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mr-am, mrv2
>    Affects Versions: 0.23.1
>            Reporter: David Capwell
>            Assignee: Sandy Ryza
>             Fix For: 0.23.2
>         Attachments: mapreduce-3688-h0.23-v01.patch, mapreduce-3688-h0.23-v02.patch
> We need better error messages in the UI if the AM gets killed or throws an Exception.
> If the following error gets thrown: 
> java.lang.NumberFormatException: For input string: "9223372036854775807l" // last char is an L
> then the UI should show this exception. Instead I get the following:
> Application application_1326504761991_0018 failed 1 times due to AM Container for appattempt_1326504761991_0018_000001
> exited with exitCode: 1 due to: Exception from container-launch: org.apache.hadoop.util.Shell$ExitCodeException
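
For reference, the exception text quoted in the description can be reproduced with plain Java. This is a small sketch that assumes the value was parsed with Long.parseLong; the issue does not say which call in the AM actually raised it, and the class and method names here are made up for the example.

```java
class ParseLongDemo {
    // Returns the exception message if the input cannot be parsed as a long,
    // or null if parsing succeeds.
    static String parseFailureMessage(String input) {
        try {
            Long.parseLong(input);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Long.MAX_VALUE parses fine without a suffix.
        System.out.println(parseFailureMessage("9223372036854775807"));
        // A trailing 'l' is Java literal syntax, which parseLong rejects.
        System.out.println(parseFailureMessage("9223372036854775807l"));
    }
}
```

The second call yields the message that the UI would ideally surface instead of the generic container-launch error.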

