From: sewen@apache.org
To: commits@flink.apache.org
Reply-To: dev@flink.apache.org
Date: Wed, 15 Feb 2017 17:57:27 -0000
Message-Id: <2d37a563f6ff43f38d7ea924929e99ef@git.apache.org>
Subject: [1/2] flink git commit: [FLINK-5805] [docs] Improvements to docs for ProcessFunction

Repository: flink
Updated Branches:
  refs/heads/master 7477c5b57 -> 5fb267de6


[FLINK-5805] [docs] Improvements to docs for ProcessFunction

This closes #3317


Project: http://git-wip-us.apache.org/repos/asf/flink/repo
Commit: http://git-wip-us.apache.org/repos/asf/flink/commit/5fb267de
Tree: http://git-wip-us.apache.org/repos/asf/flink/tree/5fb267de
Diff: http://git-wip-us.apache.org/repos/asf/flink/diff/5fb267de

Branch: refs/heads/master
Commit: 5fb267de68b68bc47c469f95b3bde8eebcd42007
Parents: 33ea78e
Author: David Anderson
Authored: Wed Feb 15 10:58:55 2017 +0100
Committer: Stephan Ewen
Committed: Wed Feb 15 18:45:46 2017 +0100

----------------------------------------------------------------------
 docs/dev/stream/process_function.md | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/flink/blob/5fb267de/docs/dev/stream/process_function.md
----------------------------------------------------------------------
diff --git a/docs/dev/stream/process_function.md b/docs/dev/stream/process_function.md
index 99a3bf6..22295be 100644
--- a/docs/dev/stream/process_function.md
+++ b/docs/dev/stream/process_function.md
@@ -47,7 +47,7 @@ stream.keyBy("id").process(new MyProcessFunction())
 
 The timers allow applications to react to changes in processing time and in [event time](../event_time.html).
 Every call to the function `processElement(...)` gets a `Context` object with gives access to the element's
-event time timestamp, and the *TimerService*. The `TimerService` can be used to register callbacks for future
+event time timestamp, and to the *TimerService*. The `TimerService` can be used to register callbacks for future
 event-/processing- time instants. When a timer's particular time is reached, the `onTimer(...)` method is
 called. During that call, all states are again scoped to the key with which the timer was created, allowing
 timers to perform keyed state manipulation as well.
@@ -55,30 +55,35 @@ timers to perform keyed state manipulation as well.
 
 ## Low-level Joins
 
-To realize low-level operations on two inputs, applications can use the `CoProcessFunction`. It relates to the `ProcessFunction`
-in the same way as a `CoFlatMapFunction` relates to the `FlatMapFunction`: The function is typed to two different inputs and
+To realize low-level operations on two inputs, applications can use `CoProcessFunction`. It relates to `ProcessFunction`
+in the same way that `CoFlatMapFunction` relates to `FlatMapFunction`: the function is bound to two different inputs and
 gets individual calls to `processElement1(...)` and `processElement2(...)` for records from the two different inputs.
 
-Implementing a low level join follows typically the pattern:
+Implementing a low level join typically follows this pattern:
 
   - Create a state object for one input (or both)
   - Update the state upon receiving elements from its input
   - Upon receiving elements from the other input, probe the state and produce the joined result
 
+For example, you might be joining customer data to financial trades,
+while keeping state for the customer data. If you care about having
+complete and deterministic joins in the face of out-of-order events,
+you can use a timer to evaluate and emit the join for a trade when the
+watermark for the customer data stream has passed the time of that
+trade.
 
 ## Example
 
-The following example maintains counts per key, and emits the key/count pair if no update happened to the key for one minute
-(in event time):
+The following example maintains counts per key, and emits a key/count pair whenever a minute passes (in event time) without an update for that key:
 
   - The count, key, and last-modification-timestamp are stored in a `ValueState`, which is implicitly scoped by key.
   - For each record, the `ProcessFunction` increments the counter and sets the last-modification timestamp
   - The function also schedules a callback one minute into the future (in event time)
   - Upon each callback, it checks the callback's event time timestamp against the last-modification time of the stored count
-    and emits the key/count if the match (no further update happened in that minute)
+    and emits the key/count if they match (i.e., no further update occurred during that minute)
 
-*Note:* This simple example could also have been implemented on top of session windows, we simple use it to illustrate
-the basic pattern of how to use the `ProcessFunction`.
+*Note:* This simple example could have been implemented with session windows. We use `ProcessFunction` here to illustrate
+the basic pattern it provides.
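
The count-with-timeout pattern described in the patched `## Example` section can be sketched as a plain-Java simulation with no Flink dependency. Everything in it is hypothetical scaffolding rather than Flink API: the class `CountWithTimeoutSketch`, the `advanceWatermark` driver, and the in-memory maps only stand in for keyed `ValueState` and the `TimerService`. The sketch assumes at most one pending timer per key and timestamp.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Library-free simulation of the count-with-timeout pattern; NOT Flink API. */
class CountWithTimeoutSketch {
    static final long TIMEOUT_MS = 60_000L; // one minute, in event time

    /** Per-key state: running count and event time of the last update. */
    static final class CountState {
        long count;
        long lastModified;
    }

    // Stand-in for keyed state (assumption: a plain map keyed by the record key).
    private final Map<String, CountState> stateByKey = new HashMap<>();
    // Stand-in for event-time timers: firing time -> keys, one timer per key and time.
    private final TreeMap<Long, Set<String>> timers = new TreeMap<>();
    private final List<String> output = new ArrayList<>();

    /** Increment the count, record the update time, schedule a callback one minute later. */
    void processElement(String key, long timestamp) {
        CountState s = stateByKey.computeIfAbsent(key, k -> new CountState());
        s.count++;
        s.lastModified = timestamp;
        timers.computeIfAbsent(timestamp + TIMEOUT_MS, t -> new LinkedHashSet<>()).add(key);
    }

    /** Fire, in time order, every timer whose time is at or before the watermark. */
    void advanceWatermark(long watermark) {
        while (!timers.isEmpty() && timers.firstKey() <= watermark) {
            Map.Entry<Long, Set<String>> due = timers.pollFirstEntry();
            for (String key : due.getValue()) {
                onTimer(key, due.getKey());
            }
        }
    }

    /** Emit key/count only if this timer belongs to the latest update for the key. */
    private void onTimer(String key, long timerTimestamp) {
        CountState s = stateByKey.get(key);
        if (s != null && timerTimestamp == s.lastModified + TIMEOUT_MS) {
            output.add(key + "=" + s.count);
        }
    }

    List<String> output() {
        return output;
    }

    public static void main(String[] args) {
        CountWithTimeoutSketch sketch = new CountWithTimeoutSketch();
        sketch.processElement("a", 0L);
        sketch.processElement("a", 30_000L); // makes the timer at 60_000 stale
        sketch.processElement("b", 10_000L);
        sketch.advanceWatermark(90_000L);
        System.out.println(sketch.output()); // prints [b=1, a=2]
    }
}
```

Stale timers are never cancelled here. Each callback compares its own timestamp with the stored last-modification time plus the timeout, so only the timer set by the most recent update for a key emits a result.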