flink-issues mailing list archives

From "Flink Jira Bot (Jira)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-15208) client submits multiple sub-jobs for job with dynamic catalog table
Date Fri, 16 Apr 2021 11:03:02 GMT

     [ https://issues.apache.org/jira/browse/FLINK-15208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Flink Jira Bot updated FLINK-15208:
-----------------------------------
    Labels: stale-assigned  (was: )

> client submits multiple sub-jobs for job with dynamic catalog table
> -------------------------------------------------------------------
>
>                 Key: FLINK-15208
>                 URL: https://issues.apache.org/jira/browse/FLINK-15208
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / API, Table SQL / Client
>            Reporter: Bowen Li
>            Assignee: Bowen Li
>            Priority: Major
>              Labels: stale-assigned
>
> With the dynamic catalog table from FLINK-15206, users can maintain a single SQL job for both their online and offline jobs. However, they still need to change their configuration each time they submit a different job.
> E.g. when users update the logic of their streaming job, they need to bootstrap both a new online job and an offline backfill job; let's call them sub-jobs of a job with a dynamic catalog table. They would have to:
> 1) manually change the execution mode in the YAML config to "streaming", execute the SQL, and submit the streaming job
> 2) manually change the execution mode in the YAML config to "batch", execute the SQL, and submit the batch job
> We should introduce a mechanism that allows users to submit all or a subset of the sub-jobs at once. In the backfill use case mentioned above, users should ideally execute the SQL just once, and Flink should spin up both jobs for them.
> Streaming platforms at some big companies such as Uber and Netflix already do this for backfill use cases in one way or another: some do it in the UI, some in the planning phase. It would be great to standardize this practice and make it as simple as possible for users.
> The assumption here is that users are fully aware of the consequences of launching two or more jobs at the same time. E.g. they need to handle overlapping results, if there are any.
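
A minimal sketch of the idea described above, assuming hypothetical names (SubJob, plan_sub_jobs, submit_all) that are not part of any Flink API. It only illustrates expanding one SQL statement into one sub-job per requested execution mode and handing each to a caller-supplied submit function; how Flink would actually plan and submit such jobs is not specified in the issue.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")

# Hypothetical: the two execution modes named in the issue description.
SUPPORTED_MODES = ("streaming", "batch")


@dataclass(frozen=True)
class SubJob:
    """Hypothetical description of one sub-job: the same SQL under one mode."""
    mode: str
    sql: str


def plan_sub_jobs(sql: str, modes: Iterable[str] = SUPPORTED_MODES) -> List[SubJob]:
    """Expand one SQL statement into one sub-job per requested execution mode."""
    plans = []
    for mode in modes:
        if mode not in SUPPORTED_MODES:
            raise ValueError(f"unsupported execution mode: {mode!r}")
        plans.append(SubJob(mode=mode, sql=sql))
    return plans


def submit_all(
    sql: str,
    submit: Callable[[SubJob], T],
    modes: Iterable[str] = SUPPORTED_MODES,
) -> List[T]:
    """Submit every planned sub-job through the caller-supplied callback."""
    return [submit(job) for job in plan_sub_jobs(sql, modes)]


if __name__ == "__main__":
    # The lambda stands in for a real submission call, which is out of scope here.
    handles = submit_all(
        "INSERT INTO sink SELECT * FROM src",
        submit=lambda job: f"{job.mode}-job",
    )
    print(handles)
```

Passing modes=["batch"] would submit only the backfill sub-job, which corresponds to the "subset of sub-jobs" case in the description.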



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
