flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-4809) Operators should tolerate checkpoint failures
Date Tue, 24 Oct 2017 12:47:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-4809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16216834#comment-16216834 ]

ASF GitHub Bot commented on FLINK-4809:
---------------------------------------

Github user aljoscha commented on a diff in the pull request:

    https://github.com/apache/flink/pull/4883#discussion_r146543640
  
    --- Diff: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/CheckpointConfig.java ---
    @@ -231,6 +234,23 @@ public void setForceCheckpointing(boolean forceCheckpointing) {
     	}
     
     	/**
    +	 * This determines the behaviour of tasks if there is an error in their checkpointing. If this returns true,
    +	 * tasks will fail as a reaction. If this returns false, task will only decline the failed checkpoint.
    +	 */
    +	public boolean isFailTasksOnCheckpointingErrors() {
    --- End diff --
    
    I think we typically don't talk about tasks in the user-facing APIs. This could be `isFailOnCheckpointingErrors()`, for example.


> Operators should tolerate checkpoint failures
> ---------------------------------------------
>
>                 Key: FLINK-4809
>                 URL: https://issues.apache.org/jira/browse/FLINK-4809
>             Project: Flink
>          Issue Type: Sub-task
>          Components: State Backends, Checkpointing
>            Reporter: Stephan Ewen
>            Assignee: Stefan Richter
>             Fix For: 1.4.0
>
>
> Operators should try/catch exceptions in the synchronous and asynchronous parts of the checkpoint and send a {{DeclineCheckpoint}} message as a result.
> The decline message should have the failure cause attached to it.
> The checkpoint barrier should be sent anyway as a first step, before attempting to make a state checkpoint, to make sure that downstream operators do not block in alignment.
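
The behaviour described in the issue can be sketched roughly as follows. This is a minimal, self-contained illustration and not Flink's actual implementation: `CheckpointDeclineSketch`, `Operator`, `Snapshotter`, and `DeclineMessage` are hypothetical stand-ins, and only the synchronous part is shown (an asynchronous part would presumably be wrapped the same way). The `failOnCheckpointingErrors` flag mirrors the getter name suggested in the review comment.

```java
import java.util.ArrayList;
import java.util.List;

public class CheckpointDeclineSketch {

    /** Hypothetical stand-in for a decline message that carries the failure cause. */
    static final class DeclineMessage {
        final long checkpointId;
        final Throwable cause;

        DeclineMessage(long checkpointId, Throwable cause) {
            this.checkpointId = checkpointId;
            this.cause = cause;
        }
    }

    /** Hypothetical hook that takes the state snapshot and may fail. */
    interface Snapshotter {
        void snapshot(long checkpointId) throws Exception;
    }

    /** Hypothetical operator; records what it would send instead of using a network. */
    static final class Operator {
        final List<String> events = new ArrayList<>();
        final List<DeclineMessage> declines = new ArrayList<>();
        private final boolean failOnCheckpointingErrors;
        private final Snapshotter snapshotter;

        Operator(boolean failOnCheckpointingErrors, Snapshotter snapshotter) {
            this.failOnCheckpointingErrors = failOnCheckpointingErrors;
            this.snapshotter = snapshotter;
        }

        void performCheckpoint(long checkpointId) {
            // Step 1: forward the barrier before touching state, so that
            // downstream operators do not block in alignment.
            events.add("barrier:" + checkpointId);
            try {
                // Step 2: attempt the state snapshot.
                snapshotter.snapshot(checkpointId);
                events.add("ack:" + checkpointId);
            } catch (Exception e) {
                if (failOnCheckpointingErrors) {
                    // Configured to fail: propagate the error.
                    throw new RuntimeException("Checkpoint " + checkpointId + " failed", e);
                }
                // Otherwise decline only this checkpoint, attaching the cause.
                declines.add(new DeclineMessage(checkpointId, e));
                events.add("decline:" + checkpointId);
            }
        }
    }

    public static void main(String[] args) {
        Operator op = new Operator(false, id -> {
            throw new java.io.IOException("simulated snapshot failure");
        });
        op.performCheckpoint(1L);
        System.out.println(op.events);
    }
}
```

With the flag set to false, the operator keeps running after a failed snapshot and the barrier has already been forwarded; with the flag set to true, the same failure surfaces as an exception.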



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
