spark-issues mailing list archives

From "Sital Kedia (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-21833) CoarseGrainedSchedulerBackend leaks executors in case of dynamic allocation
Date Thu, 24 Aug 2017 19:10:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-21833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sital Kedia updated SPARK-21833:
--------------------------------
    Description: 
We have seen an issue in the coarse-grained scheduler where, when dynamic executor allocation
is turned on, the scheduler asks for more executors than needed. Consider the situation where
the executor allocation manager is ramping down the number of executors. It lowers the
executor target number by calling the requestTotalExecutors API.

Later, when the allocation manager finds some executors to be idle, it calls the killExecutor
API. The coarse-grained scheduler, in its killExecutor function, resets the total executors
needed to current + pending, which overrides the earlier, lower target set by the allocation
manager.


This results in the scheduler spawning more executors than actually needed.
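
For illustration, here is a minimal, self-contained sketch of the sequence described above. The object and field names (ExecutorLeakSketch, requestedTotal, numExisting, numPending) are hypothetical simplifications, not Spark's actual classes or fields; only the arithmetic mirrors the reported behavior.

{code:scala}
// Hypothetical, simplified model of the interaction; these are illustrative
// stand-ins, not the actual Spark source.
object ExecutorLeakSketch {
  var requestedTotal = 100 // last target sent to the cluster manager
  val numExisting = 100    // currently registered executors
  val numPending = 0       // requested but not yet registered

  // Stands in for CoarseGrainedSchedulerBackend.requestTotalExecutors.
  def requestTotalExecutors(total: Int): Unit =
    requestedTotal = total

  // Stands in for the killExecutor path: the target is recomputed from
  // current + pending, silently discarding the lower target set above.
  def killExecutors(ids: Seq[String]): Unit =
    requestedTotal = numExisting + numPending - ids.size

  def main(args: Array[String]): Unit = {
    requestTotalExecutors(10)               // allocation manager ramps down to 10
    killExecutors(Seq("exec-1", "exec-2"))  // then kills 2 idle executors
    println(requestedTotal)                 // prints 98, not 10 -> executors leak
  }
}
{code}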

  was:
We have seen an issue in the coarse-grained scheduler where, when dynamic executor allocation
is turned on, the scheduler asks for more executors than needed. Consider the situation where
the executor allocation manager is ramping down the number of executors. It lowers the
executor target number by calling the requestTotalExecutors API (see https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L326).


Later, when the allocation manager finds some executors to be idle, it calls the killExecutor
API (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala#L447).
The coarse-grained scheduler, in its killExecutor function, resets the total executors needed
to current + pending, which overrides the earlier target set by the allocation manager (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L523).


This results in the scheduler spawning more executors than actually needed.


> CoarseGrainedSchedulerBackend leaks executors in case of dynamic allocation
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-21833
>                 URL: https://issues.apache.org/jira/browse/SPARK-21833
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>    Affects Versions: 2.2.0
>            Reporter: Sital Kedia
>
> We have seen an issue in the coarse-grained scheduler where, when dynamic executor
> allocation is turned on, the scheduler asks for more executors than needed. Consider the
> situation where the executor allocation manager is ramping down the number of executors.
> It lowers the executor target number by calling the requestTotalExecutors API.
> Later, when the allocation manager finds some executors to be idle, it calls the killExecutor
> API. The coarse-grained scheduler, in its killExecutor function, resets the total executors
> needed to current + pending, which overrides the earlier target set by the allocation manager.
> This results in the scheduler spawning more executors than actually needed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

