spark-issues mailing list archives

From "Marcelo Vanzin (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-20342) DAGScheduler sends SparkListenerTaskEnd before updating task's accumulators
Date Wed, 07 Jun 2017 16:50:18 GMT

    [ https://issues.apache.org/jira/browse/SPARK-20342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16041198#comment-16041198 ]

Marcelo Vanzin commented on SPARK-20342:
----------------------------------------

I have a fix for this; I might as well use a variant of your test code for it.

> DAGScheduler sends SparkListenerTaskEnd before updating task's accumulators
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-20342
>                 URL: https://issues.apache.org/jira/browse/SPARK-20342
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 2.2.0
>            Reporter: Marcelo Vanzin
>
> I hit this on 2.2, but it has probably been there forever. This is similar in spirit to SPARK-20205.
> The event is sent here, around L1154:
> {code}
>     listenerBus.post(SparkListenerTaskEnd(
>        stageId, task.stageAttemptId, taskType, event.reason, event.taskInfo, taskMetrics))
> {code}
> Accumulators are updated later, around L1173:
> {code}
>     val stage = stageIdToStage(task.stageId)
>     event.reason match {
>       case Success =>
>         task match {
>           case rt: ResultTask[_, _] =>
>             // Cast to ResultStage here because it's part of the ResultTask
>             // TODO Refactor this out to a function that accepts a ResultStage
>             val resultStage = stage.asInstanceOf[ResultStage]
>             resultStage.activeJob match {
>               case Some(job) =>
>                 if (!job.finished(rt.outputId)) {
>                   updateAccumulators(event)
> {code}
> Same thing applies here; UI shows correct info because it's pointing at the mutable {{TaskInfo}} structure.
> But the event log, for example, may record the wrong information.
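
The ordering problem described above can be sketched in isolation. The snippet below is a minimal illustration and not Spark code: `MutableTaskInfo`, `TaskEndEvent`, `SnapshotListener`, `LiveListener`, and `OrderingSketch` are hypothetical stand-ins, and listeners are called synchronously for simplicity (Spark's listener bus is asynchronous, which is presumably why the issue says the event log "may" record the wrong information). It assumes an event-log-style listener copies values when the event arrives, while a UI-style listener keeps a reference to the mutable object and reads it later.

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-ins for illustration only; these are NOT Spark's classes.
final class MutableTaskInfo(var accumValue: Long)
final case class TaskEndEvent(info: MutableTaskInfo)

// Copies the value at the moment the event arrives (assumed event-log-like behavior).
final class SnapshotListener {
  val recorded: ArrayBuffer[Long] = ArrayBuffer.empty[Long]
  def onTaskEnd(e: TaskEndEvent): Unit = {
    recorded += e.info.accumValue
  }
}

// Keeps the reference and reads it later (assumed UI-like behavior).
final class LiveListener {
  private var last: Option[MutableTaskInfo] = None
  def onTaskEnd(e: TaskEndEvent): Unit = {
    last = Some(e.info)
  }
  def current: Long = last.map(_.accumValue).getOrElse(0L)
}

object OrderingSketch {
  // Returns (value seen by the snapshot listener, value seen by the live listener).
  def run(postFirst: Boolean): (Long, Long) = {
    val info = new MutableTaskInfo(0L)
    val snap = new SnapshotListener
    val live = new LiveListener

    def post(): Unit = {
      val e = TaskEndEvent(info)
      snap.onTaskEnd(e)
      live.onTaskEnd(e)
    }
    def update(): Unit = {
      info.accumValue = 42L
    }

    // postFirst = true mirrors the reported order: event first, accumulators later.
    if (postFirst) { post(); update() } else { update(); post() }
    (snap.recorded.head, live.current)
  }
}
```

With `postFirst = true` the snapshot listener records the stale value while the live listener still ends up with the updated one; with `postFirst = false` both agree. This only illustrates the ordering issue and says nothing about how the actual fix is implemented.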



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


