spark-issues mailing list archives

From "gao (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (SPARK-24342) Large Task prior scheduling to Reduce overall execution time
Date Tue, 22 May 2018 09:38:00 GMT

     [ https://issues.apache.org/jira/browse/SPARK-24342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

gao updated SPARK-24342:
------------------------
    Description: 
When running a set of concurrent tasks, the overall execution time can be shortened if the
relatively large (long-running) tasks are scheduled before the small tasks.
Therefore, Spark should add a feature that schedules the large tasks in a task set first and
the small tasks afterwards.
   The duration of a task can be estimated from its historical execution time: we can assume
that a task that took a long time in the past will take a long time again in the future.
Record each task's last execution time together with the task's key in a log file, and load
the log file at the next execution. Use the RangePartitioner and the partitioning methods to
prioritize large tasks and thereby optimize concurrent task execution.
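
To illustrate why ordering matters, here is a minimal sketch in plain Python. It is not Spark code and not the proposed implementation. It assumes a fixed number of identical workers and a greedy scheduler that hands the next task to whichever worker frees up first. The task keys, the durations in `history`, and the helper names `makespan` and `longest_first` are all hypothetical.

```python
import heapq


def makespan(durations, workers):
    """Greedy list scheduling: each task, taken in the given order,
    runs on the worker that becomes free first. Returns total elapsed time."""
    free_at = [0.0] * workers  # time at which each worker becomes idle
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + d)
    return max(free_at)


def longest_first(keys, history, default=0.0):
    """Order task keys by descending historical duration.
    Keys with no recorded history get `default` and so are scheduled last."""
    return sorted(keys, key=lambda k: history.get(k, default), reverse=True)


# Hypothetical durations (seconds) recorded from a previous run, keyed by task key.
history = {"a": 1.0, "b": 1.0, "c": 1.0, "d": 1.0, "e": 8.0}
keys = ["a", "b", "c", "d", "e"]  # submission order: the long task comes last
workers = 2

submitted = makespan([history[k] for k in keys], workers)
prioritized = makespan([history[k] for k in longest_first(keys, history)], workers)

print(submitted)    # 10.0: the 8-second task starts only after the short ones
print(prioritized)  # 8.0: the 8-second task runs alongside the short ones
```

In this toy case the long task dominates the run either way, but starting it first lets the short tasks fill the other worker in the meantime.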

> Large Task prior scheduling to Reduce overall execution time
> ------------------------------------------------------------
>
>                 Key: SPARK-24342
>                 URL: https://issues.apache.org/jira/browse/SPARK-24342
>             Project: Spark
>          Issue Type: Improvement
>          Components: Optimizer
>    Affects Versions: 2.3.0
>            Reporter: gao
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org

