From issues-return-150750-archive-asf-public=cust-asf.ponee.io@flink.apache.org Thu Feb 1 17:21:09 2018
Date: Thu, 1 Feb 2018 16:21:01 +0000 (UTC)
From: "Stephan Ewen (JIRA)"
To: issues@flink.apache.org
Reply-To: dev@flink.apache.org
Subject: [jira] [Closed] (FLINK-5820) Extend State Backend Abstraction to support Global Cleanup Hooks
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

     [ https://issues.apache.org/jira/browse/FLINK-5820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen closed FLINK-5820.
-------------------------------

> Extend State Backend Abstraction to support Global Cleanup Hooks
> ----------------------------------------------------------------
>
>                 Key: FLINK-5820
>                 URL: https://issues.apache.org/jira/browse/FLINK-5820
>             Project: Flink
>          Issue Type: Improvement
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.2.0
>            Reporter: Stephan Ewen
>            Assignee: Stephan Ewen
>            Priority: Blocker
>             Fix For: 1.5.0
>
>
> The current state backend abstraction has the limitation that each piece of state is only meaningful in the context of its state handle. There is no possibility of a view onto "all state associated with checkpoint X".
> That causes several issues:
> - State might not be cleaned up in the presence of failures. When a TaskManager hands over a state handle to the JobManager and either of them fails, the state handle may be lost and the state lingers.
> - State might also linger if a cleanup operation failed temporarily and the checkpoint metadata was already disposed.
> - State cleanup is more expensive than necessary in many cases. Each state handle is released individually. For large jobs, this means thousands of release operations (typically file deletes) per checkpoint, which can be expensive on some file systems.
> - It is hard to guarantee cleanup of parent directories with the current architecture.
> The core changes proposed here are:
> 1. Each job has one core {{StateBackend}}. In the future, operators may have different {{KeyedStateBackends}} and {{OperatorStateBackends}} to mix and match, for example, RocksDB storage and in-memory storage.
> 2. The JobManager needs to be aware of the {{StateBackend}}.
> 3. Storing checkpoint metadata becomes the responsibility of the state backend, not the "completed checkpoint store". The latter only stores the pointers to the latest available checkpoints (either in process or in ZooKeeper).
> 4. The StateBackend may optionally have a hook to drop all checkpointed state that belongs to only one specific checkpoint (shared state comes as part of incremental checkpointing).
> 5. The StateBackend needs to have a hook to drop all checkpointed state up to a specific checkpoint (covering all previously discarded checkpoints).
> 6. In the future, this must support periodic cleanup hooks that track orphaned shared state from incremental checkpoints.
> For the {{FsStateBackend}}, which currently stores most of the checkpointed state (transitively for RocksDB as well), this means restructuring the storage directories as follows:
> {code}
> ..//job1-id/
>            /shared/    <-- shared checkpoint data
>            /chk-1/...  <-- data exclusive to checkpoint 1
>            /chk-2/...  <-- data exclusive to checkpoint 2
>            /chk-3/...  <-- data exclusive to checkpoint 3
> ..//job2-id/
>            /shared/...
>            /chk-1/...
>            /chk-2/...
>            /chk-3/...
> ..//savepoint-1/savepoint-root
>                /file-1-uid
>                /file-2-uid
>                /file-3-uid
>   /savepoint-2/savepoint-root
>                /file-1-uid
>                /file-2-uid
>                /file-3-uid
> {code}
> This is the umbrella issue for the individual steps needed to address this.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
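
A minimal sketch of what the cleanup hooks from points 4 and 5 above might look like against the proposed directory layout. The names used here (CheckpointStorageBackend, FsCheckpointStorageBackend, dropCheckpoint, dropAllCheckpointsUpTo) are hypothetical illustrations, not Flink's actual interfaces:

{code}
// Hypothetical sketch, not Flink's real API: a backend that owns the whole
// storage layout and can therefore clean up per checkpoint, not per handle.
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

interface CheckpointStorageBackend {

    // Point 4: drop all state exclusive to one checkpoint (delete chk-N/).
    void dropCheckpoint(String jobId, long checkpointId) throws IOException;

    // Point 5: drop all checkpointed state up to the given checkpoint,
    // covering checkpoints that were discarded earlier but left residue.
    void dropAllCheckpointsUpTo(String jobId, long checkpointId) throws IOException;
}

class FsCheckpointStorageBackend implements CheckpointStorageBackend {

    private final Path rootDir; // root of the per-job layout sketched above

    FsCheckpointStorageBackend(Path rootDir) {
        this.rootDir = rootDir;
    }

    @Override
    public void dropCheckpoint(String jobId, long checkpointId) throws IOException {
        // One recursive delete of chk-N/ replaces thousands of per-handle deletes.
        deleteRecursively(rootDir.resolve(jobId).resolve("chk-" + checkpointId));
    }

    @Override
    public void dropAllCheckpointsUpTo(String jobId, long checkpointId) throws IOException {
        for (long id = 1; id <= checkpointId; id++) {
            dropCheckpoint(jobId, id);
        }
        // shared/ is deliberately left untouched: orphaned shared state from
        // incremental checkpoints is the periodic-cleanup concern of point 6.
    }

    private static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return; // already cleaned up, nothing to do
        }
        try (Stream<Path> paths = Files.walk(dir)) {
            // Delete children before parents so directories are empty when removed.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }
}
{code}

The design point the sketch tries to capture: because the backend owns the whole per-job directory, releasing a checkpoint becomes a single recursive delete of chk-N/ rather than one delete per state handle, and the same ownership makes guaranteed cleanup of parent directories possible.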