flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5214) Clean up checkpoint files when failing checkpoint operation on TM
Date Thu, 01 Dec 2016 16:47:58 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15712443#comment-15712443 ]

ASF GitHub Bot commented on FLINK-5214:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/2918

    [FLINK-5214] Clean up checkpoint data in case of a failing checkpoint operation

    Adds exception handling to the stream operators for the snapshotState method. If an
    exception occurs while performing the snapshot operation, all data checkpointed up to
    that point is discarded/deleted. This ensures that a failing checkpoint operation does
    not leave orphaned checkpoint data (e.g. files) behind.
    
    Add test case for FsCheckpointStateOutputStream
    
    Add RocksDB FullyAsyncSnapshot cleanup test
    
    Add proper state cleanup tests for window operator
    
    Add state cleanup test for failing snapshot call of AbstractUdfStreamOperator
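
The cleanup behavior described above can be illustrated with a minimal sketch. It does not use Flink's actual classes: `StateHandle`, `SnapshotStep`, and `snapshotAll` are hypothetical names introduced only for illustration, and the real change in the pull request may be structured differently. The idea is that every piece of state written so far is tracked, and if a later step throws, the already-written pieces are discarded before the exception is rethrown.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: none of these types are Flink classes.
class SnapshotCleanupSketch {

    /** Hypothetical stand-in for checkpointed state, e.g. a file on disk. */
    interface StateHandle {
        void discard() throws Exception;
    }

    /** Hypothetical single snapshot step that may fail. */
    interface SnapshotStep {
        StateHandle snapshot() throws Exception;
    }

    /**
     * Runs all snapshot steps. If any step fails, every handle created
     * before the failure is discarded, then the failure is rethrown.
     */
    static List<StateHandle> snapshotAll(List<SnapshotStep> steps) throws Exception {
        List<StateHandle> created = new ArrayList<>();
        try {
            for (SnapshotStep step : steps) {
                created.add(step.snapshot());
            }
            return created;
        } catch (Exception snapshotFailure) {
            for (StateHandle handle : created) {
                try {
                    handle.discard();
                } catch (Exception discardFailure) {
                    // Keep cleaning up; report the cleanup problem alongside the cause.
                    snapshotFailure.addSuppressed(discardFailure);
                }
            }
            throw snapshotFailure;
        }
    }
}
```

Attaching discard failures as suppressed exceptions keeps the original snapshot failure as the primary error while still letting the remaining handles be cleaned up.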
    
    cc @StephanEwen

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink fixTaskCheckpointFailure

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2918.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2918
    
----
commit 35fc74dd501fc49aa0b55f415c85c2140206220a
Author: Till Rohrmann <trohrmann@apache.org>
Date:   2016-12-01T12:25:05Z

    [FLINK-5214] Clean up checkpoint data in case of a failing checkpoint operation
    

----


> Clean up checkpoint files when failing checkpoint operation on TM
> -----------------------------------------------------------------
>
>                 Key: FLINK-5214
>                 URL: https://issues.apache.org/jira/browse/FLINK-5214
>             Project: Flink
>          Issue Type: Bug
>          Components: TaskManager
>    Affects Versions: 1.2.0, 1.1.3
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>             Fix For: 1.2.0, 1.1.4
>
>
> When the {{StreamTask#performCheckpoint}} operation fails on a {{TaskManager}}, any checkpoint
> files that were already created are not cleaned up. This should be changed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
