Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id 3F7C420049D for ; Wed, 9 Aug 2017 15:26:15 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id 3E178168F58; Wed, 9 Aug 2017 13:26:15 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 5B733168F57 for ; Wed, 9 Aug 2017 15:26:14 +0200 (CEST) Received: (qmail 19724 invoked by uid 500); 9 Aug 2017 13:26:13 -0000 Mailing-List: contact user-help@flink.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Delivered-To: mailing list user@flink.apache.org Received: (qmail 19714 invoked by uid 99); 9 Aug 2017 13:26:13 -0000 Received: from mail-relay.apache.org (HELO mail-relay.apache.org) (140.211.11.15) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 09 Aug 2017 13:26:13 +0000 Received: from aljoschas-mbp.fritz.box (ipservice-092-219-057-167.092.219.pools.vodafone-ip.de [92.219.57.167]) by mail-relay.apache.org (ASF Mail Server at mail-relay.apache.org) with ESMTPSA id 12B751A0040; Wed, 9 Aug 2017 13:26:11 +0000 (UTC) From: Aljoscha Krettek Message-Id: Content-Type: multipart/alternative; boundary="Apple-Mail=_8B9EAD7B-C4F9-4FB2-A4BC-56B4DA0A696A" Mime-Version: 1.0 (Mac OS X Mail 10.3 \(3273\)) Subject: Re: Flink - Handling late events - main vs late window firing Date: Wed, 9 Aug 2017 15:26:08 +0200 In-Reply-To: <1352149323.725577.1502037943353@mail.yahoo.com> Cc: User To: M Singh References: <1352149323.725577.1502037943353.ref@mail.yahoo.com> <1352149323.725577.1502037943353@mail.yahoo.com> X-Mailer: Apple Mail (2.3273) archived-at: Wed, 09 Aug 2017 13:26:15 -0000 --Apple-Mail=_8B9EAD7B-C4F9-4FB2-A4BC-56B4DA0A696A Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii Hi, 1. You could use a ProcessWindowFunction instead of a WindowFunction. In = there, you can query the current watermark and thus determine why the = firing is happening. Also, in a ProcessWindowFunction you can keep = per-window state, this would allow you to keep a bit of state that can = tell you whether this is the first firing for a given window or the = number of firings so far. 2. This depends on whether the Trigger is purging or not. The default = EventTimeTrigger is not purging, meaning that all elements in the window = will be preserved after firing (until the watermark reaches the end of = the window plus the allowed lateness). You can turn this into a purging = trigger using PurgingTrigger.of(EventTimeTrigger.create()). You would = specify this using .trigger() on WindowedStream when constructing your = windowed operation. 3. It doesn't, you have to manually keep state in a = ProcessWindowFunction to distinguish between different cases, as = mentioned above. 4. Currently, I think there are no examples because this depends to a = large degree on the specifics of the application. I'm afraid. Best, Aljoscha > On 6. Aug 2017, at 18:45, M Singh wrote: >=20 > Hi Folks: >=20 > I am going through flink documentation and it states the following: >=20 > "You should be aware that the elements emitted by a late firing should = be treated as updated results of a previous computation, i.e., your data = stream will contain multiple results for the same computation. Depending = on your application, you need to take these duplicated results into = account or deduplicate them." >=20 > I wanted to find out the following: >=20 > 1. How do we distinguish the late firing from the main firing ? > 2. Does the late firing including all events or only late events ? > 3. How does the late vs main firing affect the associated window = function ? > 4. Are there any examples of how to handle these events and = deduplication mentioned in the documentation ? >=20 > Thanks for your help. >=20 > Mans --Apple-Mail=_8B9EAD7B-C4F9-4FB2-A4BC-56B4DA0A696A Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=us-ascii Hi,

1. = You could use a ProcessWindowFunction instead of a WindowFunction. In = there, you can query the current watermark and thus determine why the = firing is happening. Also, in a ProcessWindowFunction you can keep = per-window state, this would allow you to keep a bit of state that can = tell you whether this is the first firing for a given window or the = number of firings so far.

2. This depends on whether the Trigger is purging or not. The = default EventTimeTrigger is not purging, meaning that all elements in = the window will be preserved after firing (until the watermark reaches = the end of the window plus the allowed lateness). You can turn this into = a purging trigger using PurgingTrigger.of(EventTimeTrigger.create()). = You would specify this using .trigger() on WindowedStream when = constructing your windowed operation.

3. It doesn't, you have to manually = keep state in a ProcessWindowFunction to distinguish between different = cases, as mentioned above.

4. Currently, I think there are no examples because this = depends to a large degree on the specifics of the application. I'm = afraid.

Best,
Aljoscha

On = 6. Aug 2017, at 18:45, M Singh <mans2singh@yahoo.com> wrote:

Hi = Folks:

I am going = through flink documentation and it states the following:

"You should be aware that the elements emitted by a late = firing should be treated as updated results of a previous computation, = i.e., your data stream will contain multiple results for the same = computation. Depending on your application, you need to take these = duplicated results into account or deduplicate them."

I wanted to find = out the following:
1. How do we = distinguish the late firing from the main firing ?
2. Does the late = firing including all events or only late events ?
3. How does the = late vs main firing affect the associated window function ?
4. Are there any examples of how to handle these events and = deduplication mentioned in the documentation ?
Thanks for your help.
Mans

= --Apple-Mail=_8B9EAD7B-C4F9-4FB2-A4BC-56B4DA0A696A--