flink-issues mailing list archives

From "Stephan Ewen (JIRA)" <j...@apache.org>
Subject [jira] [Created] (FLINK-8531) Support separation of "Exclusive", "Shared" and "Task owned" state
Date Tue, 30 Jan 2018 18:32:01 GMT
Stephan Ewen created FLINK-8531:

             Summary: Support separation of "Exclusive", "Shared" and "Task owned" state
                 Key: FLINK-8531
                 URL: https://issues.apache.org/jira/browse/FLINK-8531
             Project: Flink
          Issue Type: Sub-task
          Components: State Backends, Checkpointing
            Reporter: Stephan Ewen
            Assignee: Stephan Ewen
             Fix For: 1.5.0

Currently, all state created at a certain checkpoint goes into the directory {{chk-id}}.

With incremental checkpointing, some state is shared across checkpoints and is referenced by
newer checkpoints. As a result, old {{chk-id}} directories stay around because they still contain
shared chunks. That makes it hard for both users and cleanup hooks to determine when a {{chk-x}}
directory can be deleted.

The same holds for state that can only ever be dropped by certain operators on the TaskManager,
never by the JobManager / CheckpointCoordinator. Examples of such state are write-ahead logs,
which need to be retained until the move to the target system is complete, which may in some
cases be later than when the checkpoint that created them is disposed.

I propose to introduce different scopes for state:
  - **EXCLUSIVE** is for state that belongs to one checkpoint only
  - **SHARED** is for state that is possibly part of multiple checkpoints
  - **TASKOWNED** is for state that must never be dropped by the JobManager
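
As a rough illustration of the three scopes, here is a minimal Java sketch. The enum {{StateScope}}, its fields, and its method names are hypothetical and are not taken from the Flink code base. The sketch also assumes that EXCLUSIVE and SHARED state may be dropped by the JobManager, which the description only implies by contrast with TASKOWNED.

```java
/** Hypothetical enum mirroring the three proposed scopes (not Flink's actual API). */
enum StateScope {
    /** State that belongs to exactly one checkpoint. */
    EXCLUSIVE(true, true),
    /** State that may be referenced by several checkpoints. */
    SHARED(false, true),
    /** State that only the owning task may drop. */
    TASKOWNED(false, false);

    private final boolean ownedBySingleCheckpoint;
    // Assumption: only TASKOWNED is off limits to the JobManager.
    private final boolean droppableByJobManager;

    StateScope(boolean ownedBySingleCheckpoint, boolean droppableByJobManager) {
        this.ownedBySingleCheckpoint = ownedBySingleCheckpoint;
        this.droppableByJobManager = droppableByJobManager;
    }

    /** True if the state can be discarded together with its one checkpoint. */
    boolean isOwnedBySingleCheckpoint() {
        return ownedBySingleCheckpoint;
    }

    /** True if the JobManager / CheckpointCoordinator may dispose of the state. */
    boolean mayBeDroppedByJobManager() {
        return droppableByJobManager;
    }
}

final class StateScopeDemo {
    public static void main(String[] args) {
        // Print the disposal properties of every scope.
        for (StateScope scope : StateScope.values()) {
            System.out.printf(
                    "%s singleCheckpoint=%b jobManagerMayDrop=%b%n",
                    scope,
                    scope.isOwnedBySingleCheckpoint(),
                    scope.mayBeDroppedByJobManager());
        }
    }
}
```

Under this sketch, a cleanup hook would only need to check {{isOwnedBySingleCheckpoint()}} to decide whether a piece of state can go away with its checkpoint.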

For file based checkpoint targets, I propose that we have the following directory layout:
    + --shared/
    + --taskowned/
    + --chk-00001/
    + --chk-00002/
    + --chk-00003/
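
The layout above could be resolved from a scope roughly as follows. This is only a sketch under stated assumptions: the class {{CheckpointLayout}}, the nested {{Scope}} enum, and the method {{directoryFor}} are hypothetical names, and the five-digit zero padding simply mirrors the example directory names rather than any confirmed naming rule.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that maps a state scope to a directory in the proposed layout. */
final class CheckpointLayout {

    /** Hypothetical scope names, taken from the proposal text. */
    enum Scope { EXCLUSIVE, SHARED, TASKOWNED }

    private CheckpointLayout() {}

    /**
     * Returns the directory for state of the given scope.
     * The checkpoint id is only used for EXCLUSIVE state.
     */
    static Path directoryFor(Path base, Scope scope, long checkpointId) {
        switch (scope) {
            case SHARED:
                return base.resolve("shared");
            case TASKOWNED:
                return base.resolve("taskowned");
            case EXCLUSIVE:
                // Assumption: ids are zero-padded to five digits, as in chk-00001.
                return base.resolve(String.format("chk-%05d", checkpointId));
            default:
                throw new IllegalArgumentException("Unknown scope: " + scope);
        }
    }

    public static void main(String[] args) {
        Path base = Paths.get("checkpoints"); // hypothetical base directory
        System.out.println(directoryFor(base, Scope.SHARED, 2));
        System.out.println(directoryFor(base, Scope.TASKOWNED, 2));
        System.out.println(directoryFor(base, Scope.EXCLUSIVE, 2));
    }
}
```

With such a mapping, deleting a {{chk-x}} directory only touches state that belongs to that one checkpoint, while shared and task-owned files live outside it.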

This message was sent by Atlassian JIRA
