flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-3257) Add Exactly-Once Processing Guarantees in Iterative DataStream Jobs
Date Mon, 27 Mar 2017 11:42:41 GMT

    [ https://issues.apache.org/jira/browse/FLINK-3257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15943105#comment-15943105

ASF GitHub Bot commented on FLINK-3257:

Github user senorcarbone commented on the issue:

    Thanks for the review @gyfora and @StephanEwen , these are very good points.
    @StephanEwen makes sense to not really index/keep metadata of individual records in log
slices, it is extra overhead. Writing raw operator state makes sense, so I will do that once
@StefanRRichter  gives me some pointers, that would be great. 
    Any redistribution of the checkpoint slices would violate causality so I hope the "list
redistribution pattern" actually keeps the set of registered operator states per instance
intact. The garbage collection issue still remains but maybe (if @StefanRRichter approves)
I can add an `unregister` functionality to the `OperatorStateStore`.
    I can also add preconfigured operators (not that they will be reused anywhere). It is
more clean but I really need to see how can I get full control of the `task` checkpointing
behaviour from the `operator` level (since the default task checkpointing behaviour is altered
at the task-level).

> Add Exactly-Once Processing Guarantees in Iterative DataStream Jobs
> -------------------------------------------------------------------
>                 Key: FLINK-3257
>                 URL: https://issues.apache.org/jira/browse/FLINK-3257
>             Project: Flink
>          Issue Type: Improvement
>          Components: DataStream API
>            Reporter: Paris Carbone
>            Assignee: Paris Carbone
> The current snapshotting algorithm cannot support cycles in the execution graph. An alternative
scheme can potentially include records in-transit through the back-edges of a cyclic execution
graph (ABS [1]) to achieve the same guarantees.
> One straightforward implementation of ABS for cyclic graphs can work as follows along
the lines:
> 1) Upon triggering a barrier in an IterationHead from the TaskManager start block output
and start upstream backup of all records forwarded from the respective IterationSink.
> 2) The IterationSink should eventually forward the current snapshotting epoch barrier
to the IterationSource.
> 3) Upon receiving a barrier from the IterationSink, the IterationSource should finalize
the snapshot, unblock its output and emit all records in-transit in FIFO order and continue
the usual execution.
> --
> Upon restart the IterationSource should emit all records from the injected snapshot first
and then continue its usual execution.
> Several optimisations and slight variations can be potentially achieved but this can
be the initial implementation take.
> [1] http://arxiv.org/abs/1506.08603

This message was sent by Atlassian JIRA

View raw message