spark-issues mailing list archives

From "Saisai Shao (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-24615) Accelerator aware task scheduling for Spark
Date Thu, 21 Jun 2018 08:30:00 GMT
Saisai Shao created SPARK-24615:
-----------------------------------

             Summary: Accelerator aware task scheduling for Spark
                 Key: SPARK-24615
                 URL: https://issues.apache.org/jira/browse/SPARK-24615
             Project: Spark
          Issue Type: Improvement
          Components: Spark Core
    Affects Versions: 2.4.0
            Reporter: Saisai Shao


In machine learning, accelerator cards (GPU, FPGA, TPU) are predominant compared to
CPUs. For the current Spark architecture to work with accelerator cards, Spark itself
must be aware of accelerators and know how to schedule tasks onto the executors
that are equipped with them.

Spark's current scheduler places tasks based on data locality plus the number of available
CPU cores. This causes several problems when scheduling tasks that require accelerators:
 # A node usually has more CPU cores than accelerators, so scheduling accelerator-required
tasks by CPU core count leads to a mismatch.
 # Within a cluster, we can always assume that every node has CPUs, but the same is not true
of accelerator cards.
 # The existence of heterogeneous tasks (accelerator required or not) requires the scheduler to
place tasks in a smarter way.

We therefore propose improving the current scheduler to support heterogeneous tasks (accelerator
required or not). Details are in the attached Google doc.
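
To make the mismatch described above concrete, here is a minimal toy sketch in Python. It is
not Spark code and not the design from the attached doc; the names Executor, Task,
cpu_only_fits, accelerator_aware_fits and schedule are all hypothetical, and the greedy
first-fit policy is an assumption chosen only for illustration. It compares a placement check
that looks only at free CPU cores with one that also looks at free accelerators.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Executor:
    """Hypothetical executor model; not a Spark class."""
    name: str
    free_cores: int
    free_accelerators: int = 0  # not every node has accelerator cards


@dataclass
class Task:
    """Hypothetical task model; accelerators == 0 means a CPU-only task."""
    name: str
    cores: int = 1
    accelerators: int = 0


def cpu_only_fits(executor: Executor, task: Task) -> bool:
    # Looks only at free CPU cores and ignores accelerators entirely.
    return executor.free_cores >= task.cores


def accelerator_aware_fits(executor: Executor, task: Task) -> bool:
    # Also requires enough free accelerators on the executor.
    return (executor.free_cores >= task.cores
            and executor.free_accelerators >= task.accelerators)


def schedule(tasks: List[Task],
             executors: List[Executor],
             fits: Callable[[Executor, Task], bool]) -> Dict[str, Optional[str]]:
    """Greedy first-fit placement: maps task name to executor name, or None."""
    placement: Dict[str, Optional[str]] = {}
    for task in tasks:
        target = next((e for e in executors if fits(e, task)), None)
        if target is not None:
            target.free_cores -= task.cores
            # With cpu_only_fits this can go negative, i.e. oversubscription.
            target.free_accelerators -= task.accelerators
        placement[task.name] = target.name if target else None
    return placement


def demo():
    def cluster() -> List[Executor]:
        return [Executor("cpu-node", free_cores=4),
                Executor("gpu-node", free_cores=4, free_accelerators=1)]

    tasks = [Task("gpu-1", accelerators=1),
             Task("gpu-2", accelerators=1),
             Task("cpu-1")]
    naive = schedule(tasks, cluster(), cpu_only_fits)
    aware = schedule(tasks, cluster(), accelerator_aware_fits)
    return naive, aware
```

In this toy setup, the CPU-only check sends both accelerator tasks to the node without any
accelerator, while the accelerator-aware check places one on the accelerator node and leaves
the other unscheduled until an accelerator is free.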



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


