spark-issues mailing list archives

From "KaiXinXIaoLei (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-14228) Lost executor of RPC disassociated, and occurs exception: Could not find CoarseGrainedScheduler or it has been stopped
Date Mon, 11 Dec 2017 13:45:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-14228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16285920#comment-16285920 ]

KaiXinXIaoLei commented on SPARK-14228:
---------------------------------------

With this patch applied, the problem still exists.

> Lost executor of RPC disassociated, and occurs exception: Could not find CoarseGrainedScheduler or it has been stopped
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-14228
>                 URL: https://issues.apache.org/jira/browse/SPARK-14228
>             Project: Spark
>          Issue Type: Bug
>            Reporter: meiyoula
>             Fix For: 2.3.0
>
>
> When I start 1000 executors and then stop the process, SparkContext.stop is called to stop all executors. During this shutdown, executors that have already been killed lose their RPC connection to the driver. The driver handles each disconnect by calling reviveOffers, which fails because CoarseGrainedScheduler cannot be found or has already been stopped.
> {quote}
> 16/03/29 01:45:45 ERROR YarnScheduler: Lost executor 610 on 51-196-152-8: remote Rpc client disassociated
> 16/03/29 01:45:45 ERROR Inbox: Ignoring error
> org.apache.spark.SparkException: Could not find CoarseGrainedScheduler or it has been stopped.
> 	at org.apache.spark.rpc.netty.Dispatcher.postMessage(Dispatcher.scala:161)
> 	at org.apache.spark.rpc.netty.Dispatcher.postOneWayMessage(Dispatcher.scala:131)
> 	at org.apache.spark.rpc.netty.NettyRpcEnv.send(NettyRpcEnv.scala:173)
> 	at org.apache.spark.rpc.netty.NettyRpcEndpointRef.send(NettyRpcEnv.scala:398)
> 	at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend.reviveOffers(CoarseGrainedSchedulerBackend.scala:314)
> 	at org.apache.spark.scheduler.TaskSchedulerImpl.executorLost(TaskSchedulerImpl.scala:482)
> 	at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.removeExecutor(CoarseGrainedSchedulerBackend.scala:261)
> 	at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint$$anonfun$onDisconnected$1.apply(CoarseGrainedSchedulerBackend.scala:207)
> 	at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint$$anonfun$onDisconnected$1.apply(CoarseGrainedSchedulerBackend.scala:207)
> 	at scala.Option.foreach(Option.scala:236)
> 	at org.apache.spark.scheduler.cluster.CoarseGrainedSchedulerBackend$DriverEndpoint.onDisconnected(CoarseGrainedSchedulerBackend.scala:207)
> 	at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:144)
> 	at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
> 	at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:102)
> 	at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> 	at java.lang.Thread.run(Thread.java:745)
> {quote}
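
For illustration only, the ordering problem described in the issue can be modeled outside Spark. The following is a minimal Python sketch, not Spark code: the classes Dispatcher and SchedulerBackend, the guard_stopped flag, and the simulate helper are all hypothetical names invented for this example. It shows how disconnect events that arrive after the scheduler endpoint has been unregistered each raise a "could not find ... or it has been stopped" error, and how a stopped-flag check would suppress them. The guard is an assumption about one possible mitigation, not a description of the patch discussed in this issue.

```python
class EndpointStopped(Exception):
    """Raised when a message is posted to an endpoint that is not registered."""


class Dispatcher:
    """Toy stand-in for an RPC dispatcher: maps endpoint names to inboxes."""

    def __init__(self):
        self.endpoints = {}

    def register(self, name):
        self.endpoints[name] = []

    def unregister(self, name):
        self.endpoints.pop(name, None)

    def post(self, name, message):
        if name not in self.endpoints:
            raise EndpointStopped(f"Could not find {name} or it has been stopped.")
        self.endpoints[name].append(message)


class SchedulerBackend:
    """Toy scheduler backend that posts ReviveOffers to its own endpoint."""

    ENDPOINT = "CoarseGrainedScheduler"

    def __init__(self, dispatcher, guard_stopped=False):
        self.dispatcher = dispatcher
        self.guard_stopped = guard_stopped  # hypothetical mitigation switch
        self.stopped = False
        dispatcher.register(self.ENDPOINT)

    def stop(self):
        # Shutdown unregisters the endpoint before all disconnects are handled.
        self.stopped = True
        self.dispatcher.unregister(self.ENDPOINT)

    def revive_offers(self):
        if self.guard_stopped and self.stopped:
            return  # skip the send once the backend is stopped
        self.dispatcher.post(self.ENDPOINT, "ReviveOffers")

    def on_executor_disconnected(self, executor_id):
        # Losing an executor triggers a new round of resource offers.
        self.revive_offers()


def simulate(guard_stopped, num_executors=1000):
    """Stop the backend, then deliver late disconnects; return the error count."""
    backend = SchedulerBackend(Dispatcher(), guard_stopped)
    backend.stop()
    errors = 0
    for executor_id in range(num_executors):
        try:
            backend.on_executor_disconnected(executor_id)
        except EndpointStopped:
            errors += 1
    return errors
```

In this model, simulate(False) reports one error per late disconnect, while simulate(True) reports none. Whether an equivalent check is sufficient in Spark itself depends on threading and message ordering that this sketch does not model.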



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


