spark-issues mailing list archives

From "Apache Spark (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (SPARK-14649) DagScheduler runs duplicate tasks on fetch failure
Date Fri, 15 Apr 2016 23:53:26 GMT

     [ https://issues.apache.org/jira/browse/SPARK-14649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-14649:
------------------------------------

    Assignee:     (was: Apache Spark)

> DagScheduler runs duplicate tasks on fetch failure
> --------------------------------------------------
>
>                 Key: SPARK-14649
>                 URL: https://issues.apache.org/jira/browse/SPARK-14649
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Sital Kedia
>
> When running a job, we found many duplicate tasks running after a fetch failure in a stage.
> The cause is that, when submitting tasks for a stage, the DAG scheduler submits all pending
> tasks (tasks whose output is not available). Some of those pending tasks might already be
> running on the cluster. The DAG scheduler needs to submit only the non-running tasks for
> a stage.
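
The behavior requested in the description can be sketched as a simple filter. This is only an illustration of the idea, not Spark's DAGScheduler code: the function name `tasks_to_submit` and its three parameters are hypothetical, and partitions are modeled as plain integers.

```python
def tasks_to_submit(all_partitions, available_outputs, running_partitions):
    """Pick the partitions whose tasks should be (re)submitted for a stage.

    Hypothetical model, not Spark's API:
      all_partitions     -- every partition id in the stage
      available_outputs  -- partition ids whose output is currently available
      running_partitions -- partition ids with a task already running
    """
    # Pending tasks: output is not available (the set the report says is submitted today).
    pending = [p for p in all_partitions if p not in available_outputs]
    # Proposed behavior: skip pending tasks that are already running.
    return [p for p in pending if p not in running_partitions]


# A stage with six partitions; a fetch failure made the output of partition 1 unavailable.
stage = range(6)
available = {0}
running = {3, 4}

naive = [p for p in stage if p not in available]      # resubmits 3 and 4 as duplicates
filtered = tasks_to_submit(stage, available, running)  # leaves 3 and 4 alone
```

In this toy model the naive selection would launch second copies of the tasks for partitions 3 and 4, which is the duplication the report describes, while the filtered selection submits only partitions 1, 2 and 5.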



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
