spark-issues mailing list archives

From "Alex Boisvert (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-1304) Job fails with spot instances (due to IllegalStateException: Shutdown in progress)
Date Mon, 02 Mar 2015 17:27:04 GMT

    [ https://issues.apache.org/jira/browse/SPARK-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14343418#comment-14343418 ]

Alex Boisvert commented on SPARK-1304:
--------------------------------------

I see how it's similar to SPARK-6014, but the code paths seem different enough to keep two
separate issues open, as I suspect the resolution would be different for each.  If you disagree
and can explain how a single fix can address both, I'd be happy to close this one as a duplicate.

> Job fails with spot instances (due to IllegalStateException: Shutdown in progress)
> ----------------------------------------------------------------------------------
>
>                 Key: SPARK-1304
>                 URL: https://issues.apache.org/jira/browse/SPARK-1304
>             Project: Spark
>          Issue Type: Bug
>          Components: EC2
>    Affects Versions: 0.9.0
>            Reporter: Alex Boisvert
>            Priority: Minor
>
> We had a job running smoothly with spot instances until one of the spot instances got
> terminated ... which led to a series of "IllegalStateException: Shutdown in progress" and
> the job failed afterwards.
> 14/03/24 06:07:52 WARN scheduler.TaskSetManager: Loss was due to java.lang.IllegalStateException
> java.lang.IllegalStateException: Shutdown in progress
> 	at java.lang.ApplicationShutdownHooks.add(ApplicationShutdownHooks.java:66)
> 	at java.lang.Runtime.addShutdownHook(Runtime.java:211)
> 	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1441)
> 	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:256)
> 	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
> 	at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:77)
> 	at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:51)
> 	at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:156)
> 	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:149)
> 	at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:64)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
> 	at org.apache.spark.rdd.MappedRDD.compute(MappedRDD.scala:31)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
> 	at org.apache.spark.rdd.CoalescedRDD$$anonfun$compute$1.apply(CoalescedRDD.scala:90)
> 	at org.apache.spark.rdd.CoalescedRDD$$anonfun$compute$1.apply(CoalescedRDD.scala:89)
> 	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> 	at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
> 	at org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:57)
> 	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:95)
> 	at org.apache.spark.rdd.PairRDDFunctions$$anonfun$1.apply(PairRDDFunctions.scala:94)
> 	at org.apache.spark.rdd.RDD$$anonfun$3.apply(RDD.scala:471)
> 	at org.apache.spark.rdd.RDD$$anonfun$3.apply(RDD.scala:471)
> 	at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:34)
> 	at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:241)
> 	at org.apache.spark.rdd.RDD.iterator(RDD.scala:232)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:161)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:102)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:53)
> 	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$run$1.apply$mcV$sp(Executor.scala:213)
> 	at org.apache.spark.deploy.SparkHadoopUtil.runAsUser(SparkHadoopUtil.scala:49)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:178)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:724)
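
The top frames of the trace show the exception coming from Runtime.addShutdownHook, which the JDK documents as throwing IllegalStateException once the JVM has begun shutting down. The following is a minimal standalone sketch of that JVM behavior only; it is not Spark or Hadoop code and not a proposed fix for this issue. The class name ShutdownHookDemo and the helper tryAddShutdownHook are hypothetical names chosen for illustration.

```java
public class ShutdownHookDemo {

    /**
     * Hypothetical helper: tries to register a JVM shutdown hook.
     * Returns false instead of throwing when the JVM is already shutting down.
     */
    static boolean tryAddShutdownHook(Thread hook) {
        try {
            Runtime.getRuntime().addShutdownHook(hook);
            return true;
        } catch (IllegalStateException e) {
            // Runtime.addShutdownHook rejects new hooks once shutdown has started.
            return false;
        }
    }

    public static void main(String[] args) {
        // This hook runs while the JVM is shutting down and attempts to
        // register another hook, which is the situation seen in the trace.
        Thread outer = new Thread(() -> {
            boolean registered = tryAddShutdownHook(new Thread(() -> { }));
            System.out.println("registered during shutdown: " + registered);
        });

        // Registration succeeds here because shutdown has not started yet.
        System.out.println("registered before shutdown: " + tryAddShutdownHook(outer));
    }
}
```

In the trace above, the registration is attempted inside FileSystem$Cache.get while a task is still running, so a task that starts after the executor JVM has begun to exit would plausibly hit this path.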



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

