spark-issues mailing list archives

From "Xuefu Zhang (JIRA)" <>
Subject [jira] [Commented] (SPARK-22765) Create a new executor allocation scheme based on that of MR
Date Fri, 15 Dec 2017 05:30:04 GMT


Xuefu Zhang commented on SPARK-22765:

[~tgraves], I think it would help if SPARK-21656 makes a close-to-zero idle time workable.
Executor idle time is one source of inefficiency. Our version is too old to backport the fix,
but we will try it out when we upgrade.
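
For reference, the idle time mentioned above is controlled by the dynamic allocation settings. The following is only a minimal sketch of how such settings could be rendered as spark-submit flags. The property keys are standard Spark configuration properties, but the values (in particular the 1s timeout) and the helper name `to_submit_args` are assumptions made for illustration, not settings taken from this issue.

```python
# Illustrative values only; the 1s idle timeout is an assumed "close-to-zero" setting.
dynamic_allocation_conf = {
    "spark.dynamicAllocation.enabled": "true",
    # The external shuffle service is typically enabled alongside dynamic allocation.
    "spark.shuffle.service.enabled": "true",
    "spark.dynamicAllocation.executorIdleTimeout": "1s",
}


def to_submit_args(conf):
    """Render a dict of Spark properties as a list of spark-submit --conf arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


print(" ".join(to_submit_args(dynamic_allocation_conf)))
```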

The second source of inefficiency comes from the fact that Spark favors bigger containers. A
4-core container might be running one task while wasting the other cores and memory. The executor
cannot die as long as there is one task running. One might argue that a user can configure 1-core
containers under dynamic allocation, but this is probably not optimal in other respects.
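
To make the waste concrete, here is a small back-of-the-envelope sketch. It assumes each task occupies one core (the function name and the `cores_per_task` parameter are hypothetical, chosen only for this illustration) and ignores memory.

```python
def wasted_core_fraction(executor_cores, running_tasks, cores_per_task=1):
    """Fraction of an executor's cores sitting idle while it is kept alive."""
    if executor_cores <= 0 or cores_per_task <= 0:
        raise ValueError("core counts must be positive")
    # An executor cannot use more cores than it has.
    busy = min(running_tasks * cores_per_task, executor_cores)
    return (executor_cores - busy) / executor_cores


# A 4-core container running a single task idles three of its four cores.
print(wasted_core_fraction(4, 1))  # 0.75
# A 1-core container running a single task idles none.
print(wasted_core_fraction(1, 1))  # 0.0
```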

The third reason that one might favor MR-styled scheduling is its simplicity and efficiency.
We have frequently found that under heavy workloads the scheduler cannot really keep up with the
ups and downs in the number of tasks, especially when the tasks finish fast.

For cost-conscious users, cluster-level resource efficiency is probably what matters most.
My suspicion is that an enhanced MR-styled scheduling scheme, simple and performant, will
significantly improve resource efficiency over a typical use of dynamic allocation, without
sacrificing much performance.

As a starting point, we will first benchmark with SPARK-21656 when possible.

> Create a new executor allocation scheme based on that of MR
> -----------------------------------------------------------
>                 Key: SPARK-22765
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: Scheduler
>    Affects Versions: 1.6.0
>            Reporter: Xuefu Zhang
> Many users migrating their workloads from MR to Spark see a significant hike in resource
> consumption (e.g., SPARK-22683). While this might not be a concern for users who are more
> performance-centric, for cost-conscious users such a hike creates a migration obstacle. This
> situation can get worse as more users move to the cloud.
> Dynamic allocation makes it possible for Spark to be deployed in multi-tenant environments.
> With its performance-centric design, its inefficiency has unfortunately also shown up,
> especially when compared with MR. Thus, it's believed that an MR-styled scheduler still has
> its merit. Based on our research, the inefficiency associated with dynamic allocation has many
> sources, such as executors idling out, bigger executors, and many stages in a Spark job
> (rather than only 2 stages in MR).
> Rather than fine-tuning dynamic allocation for efficiency, the proposal here is to add
> a new, efficiency-centric scheduling scheme based on that of MR. Such an MR-based scheme can
> be further enhanced and better adapted to the Spark execution model. This alternative is
> expected to offer a good performance improvement (compared to MR) while keeping efficiency
> similar to or even better than that of MR.
> Input is very welcome!

This message was sent by Atlassian JIRA
