spark-issues mailing list archives

From "Ajith S (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-23626) Spark DAGScheduler scheduling performance hindered on JobSubmitted Event
Date Thu, 08 Mar 2018 06:29:00 GMT
Ajith S created SPARK-23626:
-------------------------------

             Summary: Spark DAGScheduler scheduling performance hindered on JobSubmitted Event
                 Key: SPARK-23626
                 URL: https://issues.apache.org/jira/browse/SPARK-23626
             Project: Spark
          Issue Type: Bug
          Components: Scheduler
    Affects Versions: 2.2.1
            Reporter: Ajith S


DAGScheduler becomes a bottleneck in the cluster when multiple JobSubmitted events have to be
processed, because DAGSchedulerEventProcessLoop is single-threaded and blocks other events in
the queue, such as TaskCompletion.

Handling a JobSubmitted event can be time-consuming, depending on the nature of the job (for
example, calculating parent stage dependencies, shuffle dependencies, and partitions), and while
it runs it blocks all other events from being processed.
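
To illustrate the effect, here is a minimal simulation of a single-threaded FIFO event loop. It
is only a sketch and not Spark code: the names Event and run_single_threaded_loop are
hypothetical, and the arrival times and handler costs are made-up simulated units chosen to
show how one slow handler delays the cheap events queued behind it.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str       # label only, e.g. "JobSubmitted" or "TaskCompletion"
    arrival: float  # simulated time at which the event is posted to the queue
    cost: float     # simulated time its handler occupies the loop thread


def run_single_threaded_loop(events):
    """Handle events in arrival (FIFO) order on one simulated thread.

    Returns a list of (name, delay), where delay is how long the event
    waited in the queue before its handler could start.
    """
    clock = 0.0
    delays = []
    for ev in sorted(events, key=lambda e: e.arrival):
        # A handler cannot start until the loop thread is free.
        start = max(clock, ev.arrival)
        delays.append((ev.name, start - ev.arrival))
        clock = start + ev.cost
    return delays


# Assumed workload: one expensive JobSubmitted followed by two cheap TaskCompletion events.
events = [
    Event("JobSubmitted", arrival=0.0, cost=5.0),
    Event("TaskCompletion", arrival=1.0, cost=0.25),
    Event("TaskCompletion", arrival=2.0, cost=0.25),
]
delays = run_single_threaded_loop(events)
for name, delay in delays:
    print(f"{name}: waited {delay} units")
```

In this run the two TaskCompletion events wait 4.0 and 3.25 units even though each needs only
0.25 units of handling, because they are queued behind the JobSubmitted handler.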

 

Multiple JIRA issues refer to this behavior:

https://issues.apache.org/jira/browse/SPARK-2647

https://issues.apache.org/jira/browse/SPARK-4961

 

Similarly, in my cluster the partition calculation for some jobs is time-consuming (similar to
the stack in SPARK-2647), which slows down the Spark DAGSchedulerEventProcessLoop. As a result,
user jobs slow down even if their tasks finish within seconds, because TaskCompletion events
are processed at a slower rate while the loop is blocked.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

