spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-24182) Improve error message for client mode when AM fails
Date Fri, 04 May 2018 23:13:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-24182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-24182:
------------------------------------

    Assignee:     (was: Apache Spark)

> Improve error message for client mode when AM fails
> ---------------------------------------------------
>
>                 Key: SPARK-24182
>                 URL: https://issues.apache.org/jira/browse/SPARK-24182
>             Project: Spark
>          Issue Type: Improvement
>          Components: YARN
>    Affects Versions: 2.3.0
>            Reporter: Marcelo Vanzin
>            Priority: Minor
>
> Today, when the client AM fails, there's not a lot of useful information printed on the output. Depending on the type of failure, the information provided by the YARN AM is also not very useful. For example, you'd see this in the Spark shell:
> {noformat}
> 18/05/04 11:07:38 ERROR spark.SparkContext: Error initializing SparkContext.
> org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
>         at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.waitForApplication(YarnClientSchedulerBackend.scala:86)
>         at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:63)
>         at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:164)
>         at org.apache.spark.SparkContext.<init>(SparkContext.scala:500)
>  [long stack trace]
> {noformat}
> Similarly, on the YARN RM, for certain failures you see a generic error like this:
> {noformat}
> ExitCodeException exitCode=10:
>         at org.apache.hadoop.util.Shell.runCommand(Shell.java:543)
>         at org.apache.hadoop.util.Shell.run(Shell.java:460)
>         at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:720)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:366)
>         at
> [blah blah blah]
> {noformat}
> It would be nice if we could provide a more accurate description of what went wrong when possible.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


