spark-issues mailing list archives

From "Saisai Shao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-21082) Consider Executor's memory usage when scheduling task
Date Thu, 15 Jun 2017 05:40:01 GMT

    [ https://issues.apache.org/jira/browse/SPARK-21082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16050016#comment-16050016
] 

Saisai Shao commented on SPARK-21082:
-------------------------------------

It is fine if the storage memory is not enough to cache all the data; Spark can still handle
that scenario without an OOM. Scheduling tasks based on an executor's free memory seems too
scenario-specific, in my understanding.

[~tgraves] [~irashid] [~mridulm80] may have more thoughts on it. 

> Consider Executor's memory usage when scheduling task 
> ------------------------------------------------------
>
>                 Key: SPARK-21082
>                 URL: https://issues.apache.org/jira/browse/SPARK-21082
>             Project: Spark
>          Issue Type: Improvement
>          Components: Scheduler, Spark Core
>    Affects Versions: 2.3.0
>            Reporter: DjvuLee
>
>  The Spark scheduler does not consider memory usage when dispatching tasks. This can lead
to executor OOM when an RDD is cached, because Spark cannot estimate the memory
usage well enough (especially when the RDD type is not flat), so the scheduler may dispatch too
many tasks to one executor.
> We could offer a configuration so that users can decide whether the scheduler considers memory
usage.
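
The proposal above (an opt-in flag that makes the scheduler skip executors with little free memory) could be sketched roughly as follows. This is an illustrative sketch only, not Spark's actual scheduler API: the names `ExecutorOffer`, `filter_offers`, and `min_free_memory_bytes` are all assumptions introduced here.

```python
# Hypothetical sketch of memory-aware offer filtering; NOT Spark's real API.
from dataclasses import dataclass

@dataclass
class ExecutorOffer:
    executor_id: str
    free_cores: int
    free_memory_bytes: int

def filter_offers(offers, consider_memory, min_free_memory_bytes):
    """Drop executors whose reported free memory is below the threshold.

    When the (proposed) config flag is off, all offers pass through
    unchanged, preserving today's behavior.
    """
    if not consider_memory:
        return offers
    return [o for o in offers if o.free_memory_bytes >= min_free_memory_bytes]
```

With the flag disabled the scheduler behaves exactly as it does now; enabling it simply narrows the set of executors a task set can be offered, which is why the comment above notes the heuristic is scenario-specific.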



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

