flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Paris Carbone (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-3257) Add Exactly-Once Processing Guarantees in Iterative DataStream Jobs
Date Mon, 18 Jan 2016 19:11:39 GMT
Paris Carbone created FLINK-3257:
------------------------------------

             Summary: Add Exactly-Once Processing Guarantees in Iterative DataStream Jobs
                 Key: FLINK-3257
                 URL: https://issues.apache.org/jira/browse/FLINK-3257
             Project: Flink
          Issue Type: Improvement
            Reporter: Paris Carbone
            Assignee: Paris Carbone


The current snapshotting algorithm cannot support cycles in the execution graph. An alternative
scheme can potentially include records in-transit through the back-edges of a cyclic execution
graph (ABS [1]) to achieve the same guarantees.

One straightforward implementation of ABS for cyclic graphs can work as follows along the
lines:

1) Upon triggering a barrier in an IterationHead from the TaskManager start block output and
start upstream backup of all records forwarded from the respective IterationSink.

2) The IterationSink should eventually forward the current snapshotting epoch barrier to the
IterationSource.

3) Upon receiving a barrier from the IterationSink, the IterationSource should finalize the
snapshot, unblock its output and emit all records in-transit in FIFO order and continue the
usual execution.
--
Upon restart the IterationSource should emit all records from the injected snapshot first
and then continue its usual execution.

Several optimisations and slight variations can be potentially achieved but this can be the
initial implementation take.

[1] http://arxiv.org/abs/1506.08603



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message