flink-issues mailing list archives

From "Stephan Ewen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-6315) Notify on checkpoint timeout
Date Thu, 20 Apr 2017 09:31:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-6315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15976368#comment-15976368
] 

Stephan Ewen commented on FLINK-6315:
-------------------------------------

Incremental checkpointing works with more than one concurrent checkpoint. You simply get
some overlap.

By the time checkpoint *n* starts, checkpoint *n-1* may not be complete yet, but there
will surely be some earlier completed checkpoint, and checkpoint *n* bases its diff on that one.
If no checkpoint ever finishes before the next one starts, each diff gets checkpointed
multiple times. So it still works, but it is something of an antipattern because it does
redundant work.
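
To illustrate the base selection described above, here is a minimal sketch in plain Java. It is not Flink's implementation: the class `IncrementalBaseSelector` and its methods are hypothetical names introduced only for this example, and the model assumes that checkpoint IDs increase monotonically.

```java
import java.util.OptionalLong;
import java.util.TreeSet;

/** Hypothetical model, not Flink code: remembers which checkpoints have completed. */
class IncrementalBaseSelector {
    // IDs of completed checkpoints, kept sorted.
    private final TreeSet<Long> completed = new TreeSet<>();

    /** Records that the given checkpoint finished successfully. */
    void markCompleted(long checkpointId) {
        completed.add(checkpointId);
    }

    /**
     * Returns the latest completed checkpoint that is older than the given one.
     * A new incremental checkpoint would diff against this base. The result is
     * empty if nothing has completed yet, in which case a full snapshot is needed.
     */
    OptionalLong baseFor(long checkpointId) {
        Long base = completed.lower(checkpointId);
        return base == null ? OptionalLong.empty() : OptionalLong.of(base);
    }
}
```

With checkpoint 1 completed and checkpoint 2 still in flight, `baseFor(3)` returns 1, so checkpoint 3 re-uploads the changes that checkpoint 2 is also uploading. That is the overlap mentioned above.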

We could make the same assumption for the eventually consistent bucketing sink.

FWIW: I think users rarely run with more than one concurrent checkpoint.

> Notify on checkpoint timeout 
> -----------------------------
>
>                 Key: FLINK-6315
>                 URL: https://issues.apache.org/jira/browse/FLINK-6315
>             Project: Flink
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Seth Wiesman
>            Assignee: Seth Wiesman
>
> A common use case when writing a custom operator that outputs data to some third-party location is to partially output on checkpoint and then commit on notifyCheckpointComplete. If that external system does not gracefully handle rollbacks (such as Amazon S3 not allowing consistent delete operations), then that data needs to be handled by the next checkpoint.
> The idea is to add a new interface, similar to CheckpointListener, that provides a callback when the CheckpointCoordinator times out a checkpoint.
> {code:java}
> /**
>  * This interface must be implemented by functions/operations that want to receive
>  * a notification when a checkpoint has been timed out by the
>  * {@link org.apache.flink.runtime.checkpoint.CheckpointCoordinator}.
>  */
> public interface CheckpointTimeoutListener {
> 	/**
> 	 * This method is called as a notification when a distributed checkpoint has timed out.
> 	 *
> 	 * @param checkpointId The ID of the checkpoint that has timed out.
> 	 * @throws Exception
> 	 */
> 	void notifyCheckpointTimeout(long checkpointId) throws Exception;
> }
> {code}
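
As a rough illustration of how the proposed callback could be used, the sketch below carries data from a timed-out checkpoint over into the next one. It is self-contained plain Java and does not use Flink classes: the interface is redeclared from the proposal in the issue (it is not an existing Flink API), and `CarryOverBuffer` with its `write`, `snapshot`, `pendingRecords` and `committedRecords` methods is a hypothetical helper invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Redeclared from the proposal in this issue; not an existing Flink interface. */
interface CheckpointTimeoutListener {
    void notifyCheckpointTimeout(long checkpointId) throws Exception;
}

/** Hypothetical sink-side buffer that commits on completion and carries over on timeout. */
class CarryOverBuffer implements CheckpointTimeoutListener {
    // Records written since the last snapshot.
    private final List<String> current = new ArrayList<>();
    // Records partially output per checkpoint, awaiting commit.
    private final Map<Long, List<String>> inFlight = new TreeMap<>();
    // Records whose checkpoint completed.
    private final List<String> committed = new ArrayList<>();

    void write(String record) {
        current.add(record);
    }

    /** Called when a checkpoint is taken: move current records to the in-flight set. */
    void snapshot(long checkpointId) {
        inFlight.put(checkpointId, new ArrayList<>(current));
        current.clear();
    }

    /** Commit the records that belong to a completed checkpoint. */
    void notifyCheckpointComplete(long checkpointId) {
        List<String> records = inFlight.remove(checkpointId);
        if (records != null) {
            committed.addAll(records);
        }
    }

    /** On timeout, put the records back in front so the next checkpoint handles them. */
    @Override
    public void notifyCheckpointTimeout(long checkpointId) {
        List<String> records = inFlight.remove(checkpointId);
        if (records != null) {
            current.addAll(0, records);
        }
    }

    List<String> pendingRecords() {
        return current;
    }

    List<String> committedRecords() {
        return committed;
    }
}
```

In this model nothing is deleted from the external system on timeout; the records are simply re-associated with the next checkpoint, which matches the motivation given in the issue description.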



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
