flink-issues mailing list archives

From tillrohrmann <...@git.apache.org>
Subject [GitHub] flink pull request #3965: [FLINK-6328] [chkPts] Don't add savepoints to Comp...
Date Mon, 22 May 2017 15:47:02 GMT
GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/3965

    [FLINK-6328] [chkPts] Don't add savepoints to CompletedCheckpointStore

    The lifecycle of savepoints is not managed by the CheckpointCoordinator; it is entirely
    in the hands of the user. Therefore, the CheckpointCoordinator cannot rely on savepoints
    when trying to recover from failures. For example, a user who moves a savepoint shortly
    before a failure could completely break Flink's recovery mechanism, because Flink cannot
    skip failed checkpoints when recovering.
    
    Therefore, until Flink is able to skip failed checkpoints when recovering, we should
    not add savepoints to the CompletedCheckpointStore, which is used to retrieve checkpoints
    for recovery. A savepoint is identified by its CheckpointProperties
    (CheckpointProperties.STANDARD_SAVEPOINT).
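
The rule described above can be illustrated with a minimal, self-contained sketch. It is not the code from this pull request and does not use Flink's classes: `SnapshotKind`, `CompletedSnapshot`, `RecoveryStore`, and `SketchCoordinator` are hypothetical stand-ins, and the enum check only approximates the distinction the description makes via the CheckpointProperties.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the checkpoint/savepoint distinction (assumption, not Flink's API).
enum SnapshotKind { CHECKPOINT, SAVEPOINT }

// Hypothetical record of a completed snapshot.
final class CompletedSnapshot {
    final long id;
    final SnapshotKind kind;

    CompletedSnapshot(long id, SnapshotKind kind) {
        this.id = id;
        this.kind = kind;
    }
}

// Hypothetical store that recovery reads from; it should only hold
// snapshots whose lifecycle the coordinator itself controls.
final class RecoveryStore {
    private final Deque<CompletedSnapshot> retained = new ArrayDeque<>();

    void add(CompletedSnapshot snapshot) {
        retained.addLast(snapshot);
    }

    // Returns the most recently added snapshot, or null if the store is empty.
    CompletedSnapshot latest() {
        return retained.peekLast();
    }

    int size() {
        return retained.size();
    }
}

// Hypothetical coordinator showing the rule: savepoints are never registered for recovery.
final class SketchCoordinator {
    private final RecoveryStore store;

    SketchCoordinator(RecoveryStore store) {
        this.store = store;
    }

    void onSnapshotCompleted(CompletedSnapshot snapshot) {
        if (snapshot.kind == SnapshotKind.SAVEPOINT) {
            // The user may move or delete a savepoint at any time,
            // so it must not become a recovery candidate.
            return;
        }
        store.add(snapshot);
    }
}
```

In this sketch, completing checkpoint 1 and then savepoint 2 leaves only checkpoint 1 in the store, so a later recovery would not depend on a file the user might have moved.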
    
    cc @rmetzger 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink fixSavepointHandling

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3965.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3965
    
----
commit 9c069ad80d66f03a0f90c8ba1a780cbba111896e
Author: Till Rohrmann <trohrmann@apache.org>
Date:   2017-05-22T15:41:14Z

    [FLINK-6328] [chkPts] Don't add savepoints to CompletedCheckpointStore
    

----


