flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aljoscha Krettek (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-2390) Replace iteration timeout with algorithm for detecting termination
Date Mon, 03 Apr 2017 12:21:41 GMT

     [ https://issues.apache.org/jira/browse/FLINK-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Aljoscha Krettek updated FLINK-2390:
    Component/s:     (was: Streaming)
                 DataStream API

> Replace iteration timeout with algorithm for detecting termination
> ------------------------------------------------------------------
>                 Key: FLINK-2390
>                 URL: https://issues.apache.org/jira/browse/FLINK-2390
>             Project: Flink
>          Issue Type: New Feature
>          Components: DataStream API
>            Reporter: Gyula Fora
>             Fix For: 1.0.0
> Currently the user can set a timeout which will shut down the iteration source/sink nodes
if no new data is received during that time to allow program termination in iterative streaming
> This method is used due to the non-trivial nature of termination in iterative streaming
jobs. While termination is not a main concern in long running streaming jobs, this behaviour
makes iterative tests non-deterministic and they often fail on travis due to the timeout.
Also setting a timeout can cause jobs to terminate prematurely.
> I propose to remove iteration timeouts and replace it with the following algorithm for
detecting termination:
> -We first identify loop edges in the jobgraph (the channels from the iteration sources
to the head operators)
> -Once the head operators (the ones with loop input) finish with all their non-loop inputs
they broadcast a marker to their outputs.
> -Each operator will broadcast a marker once it received a marker from all its non-finished
> -Iteration sources are terminated when they receive 2 consecutive markers without receiving
any record in-between
> The idea behind the algorithm is to find out when no more outputs are generated from
the operators inside an iteration after their normal inputs are finished.

This message was sent by Atlassian JIRA

View raw message