spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-17485) Failed remote cached block reads can lead to whole job failure
Date Wed, 21 Sep 2016 21:30:20 GMT

     [ https://issues.apache.org/jira/browse/SPARK-17485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-17485:
------------------------------------

    Assignee: Apache Spark  (was: Josh Rosen)

> Failed remote cached block reads can lead to whole job failure
> --------------------------------------------------------------
>
>                 Key: SPARK-17485
>                 URL: https://issues.apache.org/jira/browse/SPARK-17485
>             Project: Spark
>          Issue Type: Improvement
>          Components: Block Manager
>    Affects Versions: 1.6.2, 2.0.0
>            Reporter: Josh Rosen
>            Assignee: Apache Spark
>            Priority: Critical
>             Fix For: 2.0.1, 2.1.0
>
>
> In Spark's RDD.getOrCompute we first try to read a local copy of a cached block, then a remote copy, and only fall back to recomputing the block if no cached copy (local or remote) can be read. This logic works correctly when no remote copies of the block exist, but if remote copies _do_ exist and reads of them fail (due to network issues or internal Spark bugs), the BlockManager throws a {{BlockFetchException}} that fails the entire job.
> In the case of torrent broadcast we really _do_ want to fail the entire job when no remote blocks can be fetched, but this logic is inappropriate for cached blocks, which can (and should) simply be recomputed.
> Therefore, I think this exception should be thrown higher up the call stack, by the BlockManager client code rather than the BlockManager itself.
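
A minimal sketch of the fallback behavior described above, assuming illustrative stand-ins: readLocal, readRemote, and recompute are hypothetical helpers (not Spark APIs), and BlockFetchException is redeclared locally. The point is only that a failed remote read of a recomputable cached block is handled by the BlockManager's caller as a cache miss rather than propagating and failing the job.

{code:scala}
// Sketch only, not Spark source. Simulates the proposed fallback:
// local read -> remote read -> recompute, with remote failures swallowed.
class BlockFetchException(msg: String) extends Exception(msg)

object CachedBlockFallback {
  // Hypothetical helpers standing in for BlockManager / RDD internals.
  def readLocal(blockId: String): Option[Array[Byte]] = None            // no local copy in this sketch
  def readRemote(blockId: String): Option[Array[Byte]] =
    throw new BlockFetchException(s"remote read of $blockId failed")    // simulate a failed remote fetch
  def recompute(blockId: String): Array[Byte] = Array.fill(4)(0: Byte)  // stand-in for recomputing the partition

  /** Prefer a cached copy, but treat a failed remote read of a cached
    * (i.e. recomputable) block as a cache miss instead of a fatal error. */
  def getOrCompute(blockId: String): Array[Byte] =
    readLocal(blockId)
      .orElse {
        try readRemote(blockId)
        catch { case _: BlockFetchException => None } // fall through to recompute
      }
      .getOrElse(recompute(blockId))

  def main(args: Array[String]): Unit =
    println(s"got ${getOrCompute("rdd_0_0").length} bytes") // recomputes despite the failed fetch
}
{code}

For torrent broadcast blocks the catch clause would be omitted, since those blocks cannot be recomputed and the fetch failure should remain fatal.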



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

