spark-issues mailing list archives

From "holdenk (JIRA)" <>
Subject [jira] [Commented] (SPARK-20087) Include accumulators / taskMetrics when sending TaskKilled to onTaskEnd listeners
Date Mon, 05 Feb 2018 21:56:00 GMT


holdenk commented on SPARK-20087:

I've given up on changing the accumulator API until Spark 3+ happens. I know some people had
strong feelings about these APIs the last time we looked at them, but I don't think my view will
be the gating factor on any changes here.

> Include accumulators / taskMetrics when sending TaskKilled to onTaskEnd listeners
> ---------------------------------------------------------------------------------
>                 Key: SPARK-20087
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 2.1.0
>            Reporter: Charles Lewis
>            Priority: Major
> When tasks end due to an ExceptionFailure, subscribers to onTaskEnd receive accumulators
> / task metrics for that task, if they were still available. These metrics are not currently
> sent when tasks are killed intentionally, such as when a speculative retry finishes and the
> original is killed (or vice versa). Since we're killing these tasks ourselves, these metrics
> should almost always exist, and we should treat them the same way as we treat ExceptionFailures.
> Sending these metrics with the TaskKilled end reason makes aggregation across all tasks
> in an app more accurate. This data can inform decisions about how to tune the speculation
> parameters in order to minimize duplicated work, and in general, the total cost of an app
> should include both successful and failed tasks, if that information exists.
> PR:
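
The kind of aggregation the description talks about might look like the following Scala sketch: a SparkListener that sums executor run time across all onTaskEnd events. The class name `TotalCostListener` and the `killedWithoutMetrics` counter are illustrative, not from the issue; the null guard reflects the behavior this ticket is about, since versions without the change can deliver TaskKilled events with no metrics attached.

```scala
import org.apache.spark.TaskKilled
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Sketch: aggregate executor run time across all finished tasks,
// counting killed tasks too whenever their metrics are present.
class TotalCostListener extends SparkListener {
  @volatile var totalRunTimeMs: Long = 0L
  @volatile var killedWithoutMetrics: Long = 0L

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    // taskEnd.taskMetrics can be null for killed tasks on versions
    // without this improvement, so guard before reading it.
    Option(taskEnd.taskMetrics) match {
      case Some(metrics) =>
        totalRunTimeMs += metrics.executorRunTime
      case None =>
        taskEnd.reason match {
          case _: TaskKilled => killedWithoutMetrics += 1
          case _             => // metrics unavailable for another reason
        }
    }
  }
}

// Register with: sparkContext.addSparkListener(new TotalCostListener)
```

With the metrics attached to TaskKilled as proposed, `killedWithoutMetrics` would stay near zero and `totalRunTimeMs` would reflect the app's true cost, speculative duplicates included.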

This message was sent by Atlassian JIRA
