spark-issues mailing list archives

From "Andrew Or (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-6132) Context cleaner race condition across SparkContexts
Date Tue, 03 Mar 2015 08:55:04 GMT

     [ https://issues.apache.org/jira/browse/SPARK-6132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Or updated SPARK-6132:
-----------------------------
    Summary: Context cleaner race condition across SparkContexts  (was: Context cleaner thread lives across SparkContexts)

> Context cleaner race condition across SparkContexts
> ---------------------------------------------------
>
>                 Key: SPARK-6132
>                 URL: https://issues.apache.org/jira/browse/SPARK-6132
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.3.0
>            Reporter: Andrew Or
>            Assignee: Andrew Or
>
> The context cleaner thread is not stopped properly. If a SparkContext is started immediately after another one stops, the context cleaner of the former can clean variables belonging to the latter.
> This is because `cleaner.stop()` just sets a flag and expects the thread to terminate asynchronously, but the code that cleans broadcasts goes through `SparkEnv.get.blockManager`, which could by then belong to a different SparkContext. This is likely why `JavaAPISuite`, which creates many back-to-back SparkContexts, is flaky.
> The right behavior is to wait until all currently running cleanup tasks have finished.
> {code}
> java.io.IOException: org.apache.spark.SparkException: Failed to get broadcast_0_piece0 of broadcast_0
>         at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1180)
>         at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:164)
>         at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
>         at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
>         ...
> Caused by: org.apache.spark.SparkException: Failed to get broadcast_0_piece0 of broadcast_0
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137)
>         at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$org$apache$spark$broadcast$TorrentBroadcast$$readBlocks$1$$anonfun$2.apply(TorrentBroadcast.scala:137)
>         at scala.Option.getOrElse(Option.scala:120)
> {code}
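
The "wait until running cleanup tasks have finished" behavior described in the issue can be sketched roughly as follows. This is a minimal, self-contained illustration in plain Java and not Spark's actual ContextCleaner code: the class name `CleanerSketch`, its methods, and the 100 ms poll interval are all hypothetical. The idea is that `stop()` takes the same lock the worker holds while a cleanup task runs, so it cannot return in the middle of a task, and then joins the worker thread instead of only setting a flag.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a context cleaner; not Spark's ContextCleaner. */
final class CleanerSketch {
    private final BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<>();
    // Held by the worker for the duration of each cleanup task.
    private final Object cleanupLock = new Object();
    private volatile boolean stopped = false;
    private final Thread worker = new Thread(this::loop, "cleaner-sketch");

    void start() {
        worker.setDaemon(true);
        worker.start();
    }

    void submit(Runnable cleanupTask) {
        tasks.add(cleanupTask);
    }

    boolean isRunning() {
        return worker.isAlive();
    }

    private void loop() {
        while (!stopped) {
            try {
                // Poll with a timeout (assumed value) so the flag is re-checked regularly.
                Runnable task = tasks.poll(100, TimeUnit.MILLISECONDS);
                if (task != null) {
                    synchronized (cleanupLock) {
                        // Re-check under the lock: never start a task after stop().
                        if (!stopped) {
                            task.run();
                        }
                    }
                }
            } catch (InterruptedException e) {
                // stop() interrupts a blocked poll; the loop condition ends the thread.
            }
        }
    }

    /** Blocks until any in-flight cleanup task is done and the worker has exited. */
    void stop() throws InterruptedException {
        synchronized (cleanupLock) {
            // Acquiring the lock waits for a task that is currently running.
            stopped = true;
        }
        worker.interrupt(); // wake the worker if it is waiting in poll()
        worker.join();      // synchronous: the thread is gone when stop() returns
    }
}
```

With this shape, a caller that stops one cleaner and immediately starts another never has two cleaner threads alive at once, which is the property the issue asks for. Any task still queued when `stop()` is called is dropped rather than run.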



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


