flink-issues mailing list archives

From "Zhu Zhu (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-10240) Pluggable scheduling strategy for batch job
Date Fri, 14 Sep 2018 03:01:00 GMT

     [ https://issues.apache.org/jira/browse/FLINK-10240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhu Zhu updated FLINK-10240:
----------------------------
    Summary: Pluggable scheduling strategy for batch job  (was: Flexible scheduling strategy
is needed for batch job)

> Pluggable scheduling strategy for batch job
> -------------------------------------------
>
>                 Key: FLINK-10240
>                 URL: https://issues.apache.org/jira/browse/FLINK-10240
>             Project: Flink
>          Issue Type: New Feature
>          Components: Distributed Coordination
>            Reporter: Zhu Zhu
>            Priority: Major
>              Labels: scheduling
>
> Currently batch jobs are scheduled with the LAZY_FROM_SOURCES strategy: source tasks are scheduled
at the beginning, and other tasks are scheduled once their input data are consumable.
> However, consumable input data does not always mean the task can start working at once.
>  
> One example is the hash join operation, where the operator first consumes one side (we
call it the build side) to set up a table, then consumes the other side (we call it the probe side)
to do the real join work. If the probe side is started early, it just gets stuck on back pressure,
as the join operator will not consume data from it before the building stage is done, causing
a waste of resources.
> If the probe side task is started only after the build stage is done, both the build
and probe sides can get more computing resources, because they no longer run at the same time.
>  
> That's why we think a pluggable scheduling strategy is needed, allowing job owners to
customize the vertex scheduling order and constraints. Better resource utilization usually
means better performance.
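
To make the build/probe example concrete, here is a minimal toy sketch in Python. It is not Flink code and does not use any Flink API: the names `Vertex`, `start_after`, `staggered` and `simulate` are hypothetical, and "build stage is done" is approximated by the build source finishing. It only shows how swapping the strategy function changes the order in which vertices start.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    name: str
    inputs: list = field(default_factory=list)       # upstream vertex names
    start_after: list = field(default_factory=list)  # hypothetical constraint: must finish first


def lazy_from_sources(v, started, finished):
    # Sources start immediately; other vertices start once any input has begun producing data.
    return not v.inputs or any(i in started for i in v.inputs)


def staggered(v, started, finished):
    # Same rule, plus the extra ordering constraints declared on the vertex.
    return lazy_from_sources(v, started, finished) and all(
        c in finished for c in v.start_after
    )


def simulate(vertices, strategy):
    """Return the waves of vertex names started by `strategy` (toy round-based model)."""
    started, finished, waves = set(), set(), []
    while len(finished) < len(vertices):
        # Vertices the strategy allows to start in this round.
        wave = sorted(
            v.name for v in vertices
            if v.name not in started and strategy(v, started, finished)
        )
        running = started | set(wave)
        # A running vertex finishes once all of its inputs had finished in earlier rounds.
        done = {
            v.name for v in vertices
            if v.name in running and v.name not in finished
            and all(i in finished for i in v.inputs)
        }
        if not wave and not done:
            raise ValueError("no progress: unsatisfiable constraints")
        if wave:
            waves.append(wave)
        started.update(wave)
        finished.update(done)
    return waves


# Hash join with two sources; the probe source is constrained to wait for the build source.
graph = [
    Vertex("build_source"),
    Vertex("probe_source", start_after=["build_source"]),
    Vertex("join", inputs=["build_source", "probe_source"]),
]

lazy_waves = simulate(graph, lazy_from_sources)
staggered_waves = simulate(graph, staggered)
```

In this toy model the lazy strategy starts both sources in the first wave, while the staggered strategy starts the probe source only after the build source has finished, so the two sources never compete for resources.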



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
