spark-issues mailing list archives

From "Josh Rosen (JIRA)" <>
Subject [jira] [Commented] (SPARK-20178) Improve Scheduler fetch failures
Date Mon, 22 May 2017 20:30:04 GMT


Josh Rosen commented on SPARK-20178:

Sure, let me clarify:

* When a FetchFailure occurs, the DAGScheduler receives a fetch failure message of the form
{{FetchFailed(bmAddress, shuffleId, mapId, reduceId, failureMessage)}}.
* As of today's Spark master branch, the DAGScheduler handles this failure by marking that
individual output as unavailable and by marking all outputs on that executor as unavailable.

** As a shorthand, let's call this {{remove(shuffleId, mapId)}} followed by {{remove(blockManagerId)}}.
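
To make the shorthand concrete, here is a minimal toy sketch in Python of the two removals. It is not Spark's code: {{MapOutputRegistry}}, {{remove_output}}, {{remove_executor}} and {{handle_fetch_failure}} are hypothetical names invented for illustration, and block manager addresses are simplified to plain strings. Only the field list of {{FetchFailed}} is taken from the message above.

```python
from collections import namedtuple

# Field names mirror the FetchFailed message quoted above; everything else is a toy model.
FetchFailed = namedtuple(
    "FetchFailed", "bm_address shuffle_id map_id reduce_id failure_message"
)


class MapOutputRegistry:
    """Hypothetical stand-in for the scheduler's record of where map outputs live."""

    def __init__(self):
        # (shuffle_id, map_id) -> address of the block manager holding that output
        self.locations = {}

    def register(self, shuffle_id, map_id, bm_address):
        self.locations[(shuffle_id, map_id)] = bm_address

    def remove_output(self, shuffle_id, map_id):
        # Shorthand: remove(shuffleId, mapId)
        self.locations.pop((shuffle_id, map_id), None)

    def remove_executor(self, bm_address):
        # Shorthand: remove(blockManagerId); drops every output on that executor
        self.locations = {
            key: addr for key, addr in self.locations.items() if addr != bm_address
        }


def handle_fetch_failure(registry, failure):
    # First the single failed output, then everything else on the same executor.
    registry.remove_output(failure.shuffle_id, failure.map_id)
    registry.remove_executor(failure.bm_address)


registry = MapOutputRegistry()
registry.register(0, 0, "exec-1")
registry.register(0, 1, "exec-1")
registry.register(0, 2, "exec-2")
registry.register(1, 0, "exec-1")

handle_fetch_failure(registry, FetchFailed("exec-1", 0, 0, 5, "connection reset"))
remaining = sorted(registry.locations)
print(remaining)  # only the output on exec-2 is still considered available
```

In this toy model a single failed fetch from exec-1 also invalidates that executor's outputs for other map tasks and other shuffles, which is the behavior described in the second bullet.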

> Improve Scheduler fetch failures
> --------------------------------
>                 Key: SPARK-20178
>                 URL:
>             Project: Spark
>          Issue Type: Epic
>          Components: Scheduler
>    Affects Versions: 2.1.0
>            Reporter: Thomas Graves
> We have been having a lot of discussions around improving the handling of fetch failures.
> There are 4 jira currently related to this.
> We should try to get a list of things we want to improve and come up with one cohesive design.
> SPARK-20163, SPARK-20091, SPARK-14649, and SPARK-19753
> I will put my initial thoughts in a follow on comment.
