From issues-return-168283-archive-asf-public=cust-asf.ponee.io@flink.apache.org  Wed May 23 17:49:04 2018
Return-Path: <issues-return-168283-archive-asf-public=cust-asf.ponee.io@flink.apache.org>
X-Original-To: archive-asf-public@cust-asf.ponee.io
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by mx-eu-01.ponee.io (Postfix) with SMTP id 83072180645
	for <archive-asf-public@cust-asf.ponee.io>; Wed, 23 May 2018 17:49:03 +0200 (CEST)
Received: (qmail 15784 invoked by uid 500); 23 May 2018 15:49:02 -0000
Mailing-List: contact issues-help@flink.apache.org; run by ezmlm
Precedence: bulk
List-Help: <mailto:issues-help@flink.apache.org>
List-Unsubscribe: <mailto:issues-unsubscribe@flink.apache.org>
List-Post: <mailto:issues@flink.apache.org>
List-Id: <issues.flink.apache.org>
Reply-To: dev@flink.apache.org
Delivered-To: mailing list issues@flink.apache.org
Received: (qmail 15775 invoked by uid 99); 23 May 2018 15:49:02 -0000
Received: from git1-us-west.apache.org (HELO git1-us-west.apache.org) (140.211.11.23)
    by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 23 May 2018 15:49:02 +0000
Received: by git1-us-west.apache.org (ASF Mail Server at git1-us-west.apache.org, from userid 33)
	id 7FE15E0C0F; Wed, 23 May 2018 15:49:02 +0000 (UTC)
From: StephanEwen <git@git.apache.org>
To: issues@flink.apache.org
Reply-To: issues@flink.apache.org
Message-ID: <git-pr-6066-flink@git.apache.org>
Subject: [GitHub] flink pull request #6066: [FLINK-9428] [checkpointing] Allow operators to fl...
Content-Type: text/plain
Date: Wed, 23 May 2018 15:49:02 +0000 (UTC)

GitHub user StephanEwen opened a pull request:

    https://github.com/apache/flink/pull/6066

    [FLINK-9428] [checkpointing] Allow operators to flush data on checkpoint pre-barrier

    ## What is the purpose of the change
    
    Some operators maintain a small amount of transient state that may be inefficient to checkpoint, especially when it would also need to be checkpointed in a re-scalable way.
    One example is opportunistic pre-aggregation operators, which hold a small amount of pre-aggregation state that is frequently flushed downstream.
    
    Rather than persisting that state in a checkpoint, it can make sense to flush the data downstream upon a checkpoint, so that it becomes part of the downstream operator's state.
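    
    As a rough illustration of the idea, the sketch below shows a pre-aggregating operator that emits its buffered partial sums when the pre-barrier hook fires, so that nothing remains to be checkpointed. `PreAggregator`, `maxKeys`, and the `BiConsumer` used as the downstream are hypothetical stand-ins for this example, not Flink classes; only the method name `prepareSnapshotPreBarrier(long checkpointId)` is taken from this PR.
    
    ```java
    import java.util.HashMap;
    import java.util.Map;
    import java.util.function.BiConsumer;
    
    /** Hypothetical pre-aggregation operator; a simplified stand-in, not a Flink class. */
    class PreAggregator {
        // Small transient state: partial sums per key, not yet sent downstream.
        private final Map<String, Long> partialSums = new HashMap<>();
        // Hypothetical downstream: receives (key, partialSum) pairs.
        private final BiConsumer<String, Long> downstream;
        private final int maxKeys;
    
        PreAggregator(int maxKeys, BiConsumer<String, Long> downstream) {
            this.maxKeys = maxKeys;
            this.downstream = downstream;
        }
    
        /** Folds a record into the buffer and flushes opportunistically when the buffer is full. */
        void processElement(String key, long value) {
            partialSums.merge(key, value, Long::sum);
            if (partialSums.size() >= maxKeys) {
                flush();
            }
        }
    
        /** Pre-barrier hook: emit everything so the buffer is empty when the snapshot is taken. */
        void prepareSnapshotPreBarrier(long checkpointId) {
            flush();
        }
    
        /** Number of keys currently buffered. */
        int bufferedKeys() {
            return partialSums.size();
        }
    
        private void flush() {
            partialSums.forEach(downstream);
            partialSums.clear();
        }
    }
    ```
    
    In this sketch the flushed partial sums end up in whatever state the downstream keeps, which is the trade-off described above: the operator itself has no state left to snapshot, at the cost of emitting data at checkpoint time.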
    
    This feature is sensitive, because flushing state has a clear implication on the downstream operator's checkpoint alignment. However, used with care, and with the new back-pressure-based checkpoint alignment, this feature can be very useful.
    
    Because it is sensitive, this PR makes this an internal feature (accessible to operators) and does NOT expose it in the public API.
    
    ## Brief change log
    
      - Adds the `prepareSnapshotPreBarrier(long checkpointId)` call to `(Abstract)StreamOperator`, with an empty default implementation.
      - Adds a call on `OperatorChain` to call this in front-to-back order on the operators.
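    
    The shape of these two changes can be sketched roughly as follows. `SketchOperator` and `SketchOperatorChain` are hypothetical, simplified stand-ins written for this example; they are not the actual `StreamOperator` or `OperatorChain` classes, and the real signatures may differ (for instance in declared exceptions).
    
    ```java
    import java.util.ArrayList;
    import java.util.List;
    
    /** Hypothetical, simplified operator interface; the hook defaults to doing nothing. */
    interface SketchOperator {
        default void prepareSnapshotPreBarrier(long checkpointId) {
            // Empty default: operators that have nothing to flush need not override this.
        }
    }
    
    /** Hypothetical stand-in for an operator chain, holding operators from head to tail. */
    class SketchOperatorChain {
        private final List<SketchOperator> operatorsHeadFirst;
    
        SketchOperatorChain(List<SketchOperator> operatorsHeadFirst) {
            this.operatorsHeadFirst = new ArrayList<>(operatorsHeadFirst);
        }
    
        /** Invokes the hook on every chained operator in front-to-back order. */
        void prepareSnapshotPreBarrier(long checkpointId) {
            for (SketchOperator operator : operatorsHeadFirst) {
                operator.prepareSnapshotPreBarrier(checkpointId);
            }
        }
    }
    ```
    
    One plausible reason for the front-to-back order, stated here as an assumption: anything an earlier operator flushes reaches the later operators in the chain before their own hook runs, so they can flush it onward in turn.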
    
    ## Verifying this change
    
      - This change does not yet alter any behavior; it only adds a plug point for future stream operators.
      - The `OperatorChainTest` unit test validates that the call happens and that the operators are called in the right order.
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): **no**
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: **no**
      - The serializers: **no**
      - The runtime per-record code paths (performance sensitive): **no**
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: **yes**
      - The S3 file system connector: **no**
    
    ## Documentation
    
      - Does this pull request introduce a new feature? **no**
      - If yes, how is the feature documented? **not applicable**


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/StephanEwen/incubator-flink pre_barrier

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/6066.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #6066
    