spark-issues mailing list archives

From "Josh Rosen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-20178) Improve Scheduler fetch failures
Date Mon, 22 May 2017 20:30:04 GMT

    [ https://issues.apache.org/jira/browse/SPARK-20178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16020141#comment-16020141 ]

Josh Rosen commented on SPARK-20178:
------------------------------------

Sure, let me clarify:

* When a FetchFailure occurs, the DAGScheduler receives a fetch failure message of the form
{{FetchFailed(bmAddress, shuffleId, mapId, reduceId, failureMessage)}}.
* As of today's Spark master branch, the DAGScheduler handles this failure by marking that
individual output as unavailable and by marking all outputs on that executor as unavailable.
** See https://github.com/apache/spark/blob/9b09101938399a3490c3c9bde9e5f07031140fdf/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1339
and https://github.com/apache/spark/blob/9b09101938399a3490c3c9bde9e5f07031140fdf/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L1346.
** As a shorthand, let's call this {{remove(shuffleId, mapId)}} followed by {{remove(blockManagerId)}}.
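
To make the shorthand concrete, here is a minimal toy model in Python of the two-step handling described above. It is only a sketch, not Spark's implementation: the class {{ToyMapOutputRegistry}}, its method names, and the executor ids are hypothetical, and the real DAGScheduler is written in Scala and tracks considerably more state.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyMapOutputRegistry:
    """Hypothetical stand-in for the scheduler's map-output bookkeeping."""

    # (shuffle_id, map_id) -> id of the block manager (executor) holding that output
    locations: Dict[Tuple[int, int], str] = field(default_factory=dict)

    def register(self, shuffle_id: int, map_id: int, block_manager_id: str) -> None:
        self.locations[(shuffle_id, map_id)] = block_manager_id

    def remove_output(self, shuffle_id: int, map_id: int) -> None:
        # Step 1: remove(shuffleId, mapId), the single output that failed to fetch.
        self.locations.pop((shuffle_id, map_id), None)

    def remove_block_manager(self, block_manager_id: str) -> None:
        # Step 2: remove(blockManagerId), every output stored on that executor.
        self.locations = {
            key: bm for key, bm in self.locations.items() if bm != block_manager_id
        }

    def handle_fetch_failed(self, bm_address: str, shuffle_id: int, map_id: int,
                            reduce_id: int, failure_message: str) -> None:
        # Mirrors the fields of FetchFailed(bmAddress, shuffleId, mapId, reduceId,
        # failureMessage); reduce_id and failure_message are unused in this toy model.
        self.remove_output(shuffle_id, map_id)
        self.remove_block_manager(bm_address)

    def available(self) -> List[Tuple[int, int]]:
        return sorted(self.locations)


registry = ToyMapOutputRegistry()
registry.register(0, 0, "exec-1")
registry.register(0, 1, "exec-1")
registry.register(0, 2, "exec-2")
registry.register(1, 0, "exec-1")

# One failed fetch of (shuffle 0, map 0) from exec-1 also drops exec-1's other outputs.
registry.handle_fetch_failed("exec-1", 0, 0, 3, "connection refused")
print(registry.available())  # [(0, 2)]
```

In this toy run, a single fetch failure invalidates three outputs across two shuffles, because the second step removes everything registered on the failing executor.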

> Improve Scheduler fetch failures
> --------------------------------
>
>                 Key: SPARK-20178
>                 URL: https://issues.apache.org/jira/browse/SPARK-20178
>             Project: Spark
>          Issue Type: Epic
>          Components: Scheduler
>    Affects Versions: 2.1.0
>            Reporter: Thomas Graves
>
> We have been having a lot of discussions around improving the handling of fetch failures.
> There are 4 JIRAs currently related to this.
> We should try to get a list of things we want to improve and come up with one cohesive design.
> SPARK-20163, SPARK-20091, SPARK-14649, and SPARK-19753
> I will put my initial thoughts in a follow-on comment.




