spark-issues mailing list archives

From "peay (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-24523) InterruptedException when closing SparkContext
Date Wed, 10 Oct 2018 08:02:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-24523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644602#comment-16644602 ]

peay commented on SPARK-24523:
------------------------------

I've started hitting the exact same error since upgrading to EMR 5.17. I had been running Spark
2.3.1 for a while before that without issues, so I still suspect something EMR-specific. Despite
this, AWS premium support has not been very helpful and mostly blames Spark itself.

This occurs only for one particular job, which has a rather large computation graph with a
union of a couple dozen datasets. For that job, it fails 100% of the time. I don't see any
warnings about events being dropped; the OP apparently managed to get rid of those warnings
while the original issue still persisted, which seems to match our situation.
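For anyone poking at the dropped-events angle: those warnings come from the listener bus
queues filling up, and the queue size is controlled by
{{spark.scheduler.listenerbus.eventqueue.capacity}} (the submit command quoted below already
raises it to 20000). A minimal sketch of setting it programmatically instead, assuming a
SparkSession-based driver (the app name and value here are placeholders, not recommendations):

{code:scala}
import org.apache.spark.sql.SparkSession

// Sketch only: raise the listener bus queue capacity from code rather than
// via spark-submit. 40000 is an arbitrary example value.
val spark = SparkSession.builder()
  .appName("listener-bus-capacity-sketch") // hypothetical app name
  .config("spark.scheduler.listenerbus.eventqueue.capacity", "40000")
  .getOrCreate()
{code}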

Manually closing the Spark context does seem to work for us (see the sketch below). Adding
more driver cores did not help. I'm attaching [^thread-dump.log], captured slightly before
the crash, in case it's useful.
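To be concrete about "manually closing": I mean calling stop() at the end of the driver's
main instead of relying on the JVM shutdown hook that throws in the quoted stack trace below.
A rough sketch, with placeholder names, assuming a SparkSession-based app:

{code:scala}
import org.apache.spark.sql.SparkSession

// Hypothetical driver skeleton; object and app names are placeholders.
object MyJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("my-job").getOrCreate()
    try {
      // ... build the unions and write the outputs here ...
    } finally {
      // Stop the context explicitly so the listener bus is drained on the
      // main thread, not from the shutdown-hook thread seen in the trace.
      spark.stop()
    }
  }
}
{code}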

> InterruptedException when closing SparkContext
> ----------------------------------------------
>
>                 Key: SPARK-24523
>                 URL: https://issues.apache.org/jira/browse/SPARK-24523
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 2.3.0, 2.3.1
>         Environment: EMR 5.14.0, S3/HDFS inputs and outputs; EMR 5.17
>            Reporter: Umayr Hassan
>            Priority: Major
>         Attachments: spark-stop-jstack.log.1, spark-stop-jstack.log.2, spark-stop-jstack.log.3, thread-dump.log
>
>
> I'm running a Scala application in EMR with the following properties:
> {{--master yarn --deploy-mode cluster --driver-memory 13g --executor-memory 30g --executor-cores 5
> --conf spark.default.parallelism=400 --conf spark.dynamicAllocation.enabled=true
> --conf spark.dynamicAllocation.maxExecutors=20 --conf spark.eventLog.dir=hdfs:///var/log/spark/apps
> --conf spark.eventLog.enabled=true --conf spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version=2
> --conf spark.scheduler.listenerbus.eventqueue.capacity=20000 --conf spark.shuffle.service.enabled=true
> --conf spark.sql.shuffle.partitions=400 --conf spark.yarn.maxAppAttempts=1}}
> The application runs fine until the SparkContext is (automatically) closed, at which point
> stopping the SparkContext throws:
> {{18/06/10 10:44:43 ERROR Utils: Uncaught exception in thread pool-4-thread-1
> java.lang.InterruptedException
>   at java.lang.Object.wait(Native Method)
>   at java.lang.Thread.join(Thread.java:1252)
>   at java.lang.Thread.join(Thread.java:1326)
>   at org.apache.spark.scheduler.AsyncEventQueue.stop(AsyncEventQueue.scala:133)
>   at org.apache.spark.scheduler.LiveListenerBus$$anonfun$stop$1.apply(LiveListenerBus.scala:219)
>   at org.apache.spark.scheduler.LiveListenerBus$$anonfun$stop$1.apply(LiveListenerBus.scala:219)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at org.apache.spark.scheduler.LiveListenerBus.stop(LiveListenerBus.scala:219)
>   at org.apache.spark.SparkContext$$anonfun$stop$6.apply$mcV$sp(SparkContext.scala:1915)
>   at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1357)
>   at org.apache.spark.SparkContext.stop(SparkContext.scala:1914)
>   at org.apache.spark.SparkContext$$anonfun$2.apply$mcV$sp(SparkContext.scala:572)
>   at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:216)
>   at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
>   at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
>   at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
>   at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1988)
>   at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
>   at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
>   at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
>   at scala.util.Try$.apply(Try.scala:192)
>   at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
>   at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)}}
>  
> I've not seen this behavior in Spark 2.0.2 or Spark 2.2.0 (for the same application),
> so I'm not sure which change is causing Spark 2.3 to throw. Any ideas?
> Best,
> Umayr



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

