spark-issues mailing list archives

From "Josh Rosen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (SPARK-17637) Packed scheduling for Spark tasks across executors
Date Wed, 01 May 2019 14:12:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-17637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16831014#comment-16831014 ]

Josh Rosen commented on SPARK-17637:
------------------------------------

I think this old feature suggestion is still very relevant and I'd love to see someone pick
this up and rework it for a newer version of Spark (or come up with an alternative improvement).

Here's my use-case for packed scheduling of tasks:

I have a job shaped roughly like {{input.flatMap(parse).repartition(40).map(transform).write.parquet()}}, with
~1000 map tasks but only 40 reduce tasks (because I want to write exactly 40 output files).
Those reduce tasks are single-core and CPU-bound and don't require much memory, so they aren't
really able to take advantage of the idle resources from unused task slots on their executors.
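
For concreteness, here's a minimal sketch of that job shape (the {{parse}} / {{transform}} functions and the paths are just hypothetical stand-ins for my real logic):

{code:scala}
import org.apache.spark.sql.SparkSession

object PackedSchedulingUseCase {
  // Hypothetical stand-ins for the real job's parsing / transformation logic.
  def parse(line: String): Seq[String] = line.split(',').toSeq
  def transform(field: String): String = field.trim

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("packed-scheduling-use-case").getOrCreate()
    import spark.implicits._

    spark.read.textFile("hdfs:///path/to/huge/input")   // ~1000 input partitions => ~1000 map tasks
      .flatMap(parse _)      // wide, highly parallel map phase
      .repartition(40)       // shuffle down to exactly 40 partitions
      .map(transform _)      // 40 single-core, CPU-bound reduce tasks
      .write
      .parquet("hdfs:///path/to/output")                // exactly 40 output files

    spark.stop()
  }
}
{code}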

I want to run the initial map phase with huge parallelism (because my input dataset is massive)
and then scale down resource usage during the final reduce phase so as to minimize the number
of idle task slots.

I'm currently using dynamic allocation with Spark on YARN with the external shuffle service
enabled. What happens today is that the final reduce phase's tasks get round-robin scheduled
across the executors, spreading the work thin and leaving most of the executor cores idle
(e.g. with 4-core executors, the 40 single-core reduce tasks can end up pinning 40 executors
at 25% utilization each rather than fully occupying just 10). If these tasks were instead
densely packed, I'd be able to scale down all but a handful of my executors, freeing resources
for other YARN applications. When the current round-robin
policy is coupled with multi-core executors, it can sometimes cause this shape of workload
to be less resource efficient (in terms of vcore_seconds) than an equivalent Scalding / Hadoop
MapReduce job.
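
For reference, the setup described above amounts to configuration roughly like the following. The config keys are the standard Spark ones, but the specific values and executor sizing are illustrative placeholders, not my actual settings; in practice they'd typically be passed as {{--conf}} flags to spark-submit rather than hard-coded.

{code:scala}
import org.apache.spark.sql.SparkSession

// Sketch of the current setup: dynamic allocation on YARN with the external
// shuffle service, plus a modest number of large multi-core executors.
// All values below are illustrative placeholders.
val spark = SparkSession.builder()
  .appName("packed-scheduling-use-case")
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.shuffle.service.enabled", "true")        // shuffle files stay served after an executor is released
  .config("spark.dynamicAllocation.maxExecutors", "250")
  .config("spark.executor.cores", "4")                    // multi-core executors: a lone reduce task leaves 3 of 4 cores idle
  .config("spark.executor.memory", "16g")
  .getOrCreate()
{code}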

As a short-term workaround, I can reconfigure my application to run a larger number of smaller
executors, but that's somewhat painful compared to being able to flip a single flag to toggle
the task scheduling policy.
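
Concretely, that workaround is roughly the following change to the executor sizing (again just a sketch with illustrative values):

{code:scala}
import org.apache.spark.sql.SparkSession

// Workaround sketch: many small single-core executors instead of a few large
// multi-core ones, so that idle executors can actually be released during the
// 40-task reduce phase. Values are illustrative only.
val spark = SparkSession.builder()
  .appName("many-small-executors-workaround")
  .config("spark.dynamicAllocation.enabled", "true")
  .config("spark.shuffle.service.enabled", "true")
  .config("spark.executor.cores", "1")                      // one task slot per executor
  .config("spark.executor.memory", "4g")                    // sized down to match
  .config("spark.dynamicAllocation.maxExecutors", "1000")   // enough 1-core executors for the wide map phase
  .getOrCreate()
{code}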

> Packed scheduling for Spark tasks across executors
> --------------------------------------------------
>
>                 Key: SPARK-17637
>                 URL: https://issues.apache.org/jira/browse/SPARK-17637
>             Project: Spark
>          Issue Type: Improvement
>          Components: Scheduler
>            Reporter: Zhan Zhang
>            Assignee: Zhan Zhang
>            Priority: Minor
>
> Currently the Spark scheduler implements round-robin scheduling of tasks to executors, which
> is great because it distributes the load evenly across the cluster, but it can lead to significant
> resource waste in some cases, especially when dynamic allocation is enabled.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

