flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gyula Fora (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-2390) Replace iteration timeout with algorithm for detecting termination
Date Tue, 21 Jul 2015 20:24:05 GMT
Gyula Fora created FLINK-2390:

             Summary: Replace iteration timeout with algorithm for detecting termination
                 Key: FLINK-2390
                 URL: https://issues.apache.org/jira/browse/FLINK-2390
             Project: Flink
          Issue Type: New Feature
          Components: Streaming
            Reporter: Gyula Fora
             Fix For: 0.10

Currently the user can set a timeout which will shut down the iteration source/sink nodes
if no new data is received during that time to allow program termination in iterative streaming

This method is used due to the non-trivial nature of termination in iterative streaming jobs.
While termination is not a main concern in long running streaming jobs, this behaviour makes
iterative tests non-deterministic and they often fail on travis due to the timeout. Also setting
a timeout can cause jobs to terminate prematurely.

I propose to remove iteration timeouts and replace it with the following algorithm for detecting

-We first identify loop edges in the jobgraph (the channels from the iteration sources to
the head operators)
-Once the head operators (the ones with loop input) finish with all their non-loop inputs
they broadcast a marker to their outputs.
-Each operator will broadcast a marker once it received a marker from all its non-finished
-Iteration sources are terminated when they receive 2 consecutive markers without receiving
any record in-between

The idea behind the algorithm is to find out when no more outputs are generated from the operators
inside an iteration after their normal inputs are finished.

This message was sent by Atlassian JIRA

View raw message