flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5007) Retain externalized checkpoint on suspension
Date Mon, 28 Nov 2016 12:45:58 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15701881#comment-15701881 ]

ASF GitHub Bot commented on FLINK-5007:
---------------------------------------

Github user uce commented on the issue:

    https://github.com/apache/flink/pull/2750
  
    @StephanEwen Do you have time to look at this? Currently, when externalized checkpoints
are configured and the cluster shuts down by suspending all jobs, the externalized checkpoints
are cleaned up. This PR proposes to handle suspension like a cancellation and to respect the
corresponding cleanup configuration, i.e. retain if `RETAIN_ON_CANCELLATION` and delete if
`DELETE_ON_CANCELLATION`.


> Retain externalized checkpoint on suspension
> --------------------------------------------
>
>                 Key: FLINK-5007
>                 URL: https://issues.apache.org/jira/browse/FLINK-5007
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>            Reporter: Ufuk Celebi
>            Assignee: Ufuk Celebi
>             Fix For: 1.2.0
>
>
> Externalized checkpoints are cleaned up when the job is suspended. Suspensions happen
> on graceful shutdown (non-HA) or loss of leadership (HA).
> In case of HA, the checkpoint store does not clean up any checkpoints as they might be
> recovered by a new leader. The only way to stop an HA job is to actually cancel it. Therefore
> the configured cleanup behaviour doesn't matter.
> In case of non-HA, suspensions happen because of graceful shutdown (for example stopping
> a YARN session). In this case I would treat the cleanup behaviour the same as when cancelling
> the job.
> {code}
> ExternalizedCheckpointCleanup.DELETE_ON_CANCELLATION => delete on suspension
> ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION => retain on suspension
> {code}
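
The mapping proposed in the issue can be sketched as a small decision function. This is only an illustration of the rule described above, not the code from the pull request: the class `SuspensionCleanupSketch`, the enums `Cleanup` and `JobEnd`, and the method `shouldDiscard` are hypothetical names, and only the two cleanup modes and the two ways of ending a job that the issue mentions are modelled.

```java
// Hypothetical, self-contained sketch of the rule described in FLINK-5007.
// None of these types are Flink classes; they only mirror the names in the issue.
class SuspensionCleanupSketch {

    /** Stand-in for the two cleanup modes named in the issue. */
    enum Cleanup { DELETE_ON_CANCELLATION, RETAIN_ON_CANCELLATION }

    /** The two ways a job ends that the issue discusses. */
    enum JobEnd { CANCELED, SUSPENDED }

    /**
     * Returns true if the externalized checkpoint should be deleted.
     *
     * @param cleanup          configured cleanup mode
     * @param end              how the job ended
     * @param highAvailability whether the job runs in HA mode (assumed flag)
     */
    static boolean shouldDiscard(Cleanup cleanup, JobEnd end, boolean highAvailability) {
        if (end == JobEnd.SUSPENDED && highAvailability) {
            // HA: suspension means loss of leadership; a new leader may
            // recover the checkpoint, so it is never cleaned up here.
            return false;
        }
        // Non-HA suspension (graceful shutdown) is treated like a cancellation:
        // the configured cleanup mode decides.
        return cleanup == Cleanup.DELETE_ON_CANCELLATION;
    }

    public static void main(String[] args) {
        for (Cleanup cleanup : Cleanup.values()) {
            System.out.println(cleanup + " => "
                    + (shouldDiscard(cleanup, JobEnd.SUSPENDED, false) ? "delete" : "retain")
                    + " on suspension");
        }
    }
}
```

With this rule, a non-HA job configured with `RETAIN_ON_CANCELLATION` keeps its externalized checkpoint when, for example, a YARN session is stopped, which is the behaviour the issue asks for.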



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
