spark-issues mailing list archives

From "Andrew Or (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-9193) Avoid assigning tasks to executors under killing
Date Tue, 21 Jul 2015 03:49:04 GMT

     [ https://issues.apache.org/jira/browse/SPARK-9193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Or updated SPARK-9193:
-----------------------------
    Assignee: Jie Huang

> Avoid assigning tasks to executors under killing
> ------------------------------------------------
>
>                 Key: SPARK-9193
>                 URL: https://issues.apache.org/jira/browse/SPARK-9193
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 1.4.0, 1.4.1
>            Reporter: Jie Huang
>            Assignee: Jie Huang
>
> When some executors are killed by dynamic allocation, tasks are sometimes mis-assigned to
> the executors that are being lost. Such mis-assignment causes task failures, or even job
> failure if a task fails spark.task.maxFailures (4 by default) times.
> The root cause is that killExecutors does not remove the executors under killing from the
> active list right away; it relies on a later OnDisassociated event to refresh that list.
> The delay depends on cluster status, ranging from several milliseconds to just under a
> minute. Any task scheduled during that window can be assigned to an executor that is still
> listed as "active" but is already under killing, and the task then fails with "executor
> lost". The better fix is to exclude executors under killing in makeOffers(), so that tasks
> are no longer allocated to executors that are about to be lost, as sketched below.
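
A minimal sketch of the proposed exclusion, loosely modeled on CoarseGrainedSchedulerBackend
(the class, field, and method names here are illustrative assumptions, not the actual Spark
patch):

    import scala.collection.mutable

    case class ExecutorData(host: String, freeCores: Int)
    case class WorkerOffer(executorId: String, host: String, cores: Int)

    class SchedulerBackendSketch {
      private val executorDataMap = mutable.HashMap.empty[String, ExecutorData]
      // Executors we have asked to kill but whose OnDisassociated event
      // has not arrived yet.
      private val executorsPendingToRemove = mutable.HashSet.empty[String]

      def killExecutors(ids: Seq[String]): Unit = synchronized {
        // Mark executors as pending removal immediately rather than waiting
        // for the disassociation event to arrive.
        executorsPendingToRemove ++= ids.filter(executorDataMap.contains)
        // ... then send the actual kill request to the cluster manager ...
      }

      def makeOffers(): Seq[WorkerOffer] = synchronized {
        // Exclude executors under killing so no new tasks land on them.
        executorDataMap.toSeq.collect {
          case (id, data) if !executorsPendingToRemove.contains(id) =>
            WorkerOffer(id, data.host, data.freeCores)
        }
      }

      def onExecutorDisassociated(id: String): Unit = synchronized {
        executorDataMap -= id
        executorsPendingToRemove -= id
      }
    }

Filtering inside makeOffers() keeps the change local to the scheduler backend: the executor
is still torn down through the normal disassociation path, but no new work is offered to it
in the interim.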



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

