spark-issues mailing list archives

From "Thomas Graves (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
Date Tue, 19 Dec 2017 13:52:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-22765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16296811#comment-16296811 ]

Thomas Graves commented on SPARK-22765:
---------------------------------------

[~CodingCat] with SPARK-21656, executors shouldn't time out unless you are between stages.


[~xuefuz] I would be interested to see whether allocating all executors up front helps, if
you could experiment with that. You could simply make the change temporarily, build, and try a run.
Did you try a timeout of 0 or 1? I'm wondering whether we handle that properly or don't allow it.
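
As a rough illustration of the experiment discussed above, the sketch below builds `--conf` arguments for `spark-submit` using Spark's standard dynamic allocation properties. The helper names `dynamic_allocation_conf` and `to_submit_args` are hypothetical, and the executor count is a placeholder. Setting `spark.dynamicAllocation.initialExecutors` equal to `maxExecutors` is only a config-level approximation of "allocating all up front"; it is an assumption, not necessarily the code change meant here. Whether an idle timeout of 0 is accepted is exactly the open question, so the sketch does not assume an answer.

```python
def dynamic_allocation_conf(idle_timeout_s, max_executors, upfront=True):
    """Build Spark properties for a dynamic allocation experiment.

    idle_timeout_s: value for spark.dynamicAllocation.executorIdleTimeout, in seconds.
    max_executors:  upper bound on executors (placeholder value chosen by the caller).
    upfront:        if True, request max_executors at startup (an approximation
                    of "allocating all up front", assumed for illustration).
    """
    conf = {
        "spark.dynamicAllocation.enabled": "true",
        # Dynamic allocation on YARN normally relies on the external shuffle service.
        "spark.shuffle.service.enabled": "true",
        "spark.dynamicAllocation.maxExecutors": str(max_executors),
        "spark.dynamicAllocation.executorIdleTimeout": f"{idle_timeout_s}s",
    }
    if upfront:
        conf["spark.dynamicAllocation.initialExecutors"] = str(max_executors)
    return conf


def to_submit_args(conf):
    """Flatten a property dict into spark-submit '--conf key=value' arguments."""
    args = []
    for key in sorted(conf):
        args += ["--conf", f"{key}={conf[key]}"]
    return args


if __name__ == "__main__":
    # Example: 1-second idle timeout, all 50 executors requested at startup.
    print(" ".join(to_submit_args(dynamic_allocation_conf(1, 50))))
```

Running the same job once with `idle_timeout_s=1` and once with the default timeout, with `upfront` on and off, would give a simple comparison of resource usage across the four combinations.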

> Create a new executor allocation scheme based on that of MR
> -----------------------------------------------------------
>
>                 Key: SPARK-22765
>                 URL: https://issues.apache.org/jira/browse/SPARK-22765
>             Project: Spark
>          Issue Type: Improvement
>          Components: Scheduler
>    Affects Versions: 1.6.0
>            Reporter: Xuefu Zhang
>
> Many users migrating their workloads from MR to Spark see a significant hike in resource
> consumption (e.g., SPARK-22683). While this might not concern users who are more
> performance-centric, for cost-conscious users such a hike creates a migration obstacle. The
> situation can get worse as more users move to the cloud.
> Dynamic allocation makes it possible to deploy Spark in a multi-tenant environment.
> However, its performance-centric design has also exposed inefficiency, especially when
> compared with MR. Thus, an MR-style scheduler is believed to still have merit. Based on our
> research, the inefficiency associated with dynamic allocation has several sources, such as
> executors idling out, bigger executors, and the many stages in a Spark job (rather than only
> 2 stages in MR).
> Rather than fine-tuning dynamic allocation for efficiency, the proposal here is to add
> a new, efficiency-centric scheduling scheme based on that of MR. Such an MR-based scheme can
> be further enhanced and better adapted to the Spark execution model. This alternative is
> expected to offer a good performance improvement over MR while keeping efficiency similar
> to, or even better than, that of MR.
> Input is greatly welcome!



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


