From: David Desberg
Date: Thu, 21 Jul 2016 11:22:15 -0700
Subject: Re: Processing windows in event time order
To: user@flink.apache.org

Aljoscha,

Awesome. Exactly the behavior I was hoping would be exhibited. Thank you
for the quick answer :)

Thanks,
David

On Thu, Jul 21, 2016 at 2:17 AM, Aljoscha Krettek <aljoscha@apache.org>
wrote:

> Hi David,
> Windows are processed in order of their end timestamp. So if you
> specify an allowed lateness of zero (which will only be possible on
> Flink 1.1 or by using a custom trigger), you should be able to sort the
> elements. The ordering is only valid within one key, though, since
> windows for different keys with the same end timestamp will be
> processed in an arbitrary order.
>
> @Sameer If both sources emit watermarks that are correct for the
> elements that they are emitting, the Trigger should only fire when both
> sources have progressed their watermarks sufficiently far. Could you
> maybe give a more detailed example of the problem that you described?
>
> Cheers,
> Aljoscha
>
>
> On Thu, 21 Jul 2016 at 04:03 Sameer Wadkar <sameer@axiomine.com> wrote:
>
>> Hi,
>>
>> If watermarks are arriving from multiple sources, how long does the
>> Event Time Trigger wait for the slower source to send its watermarks
>> before triggering only from the faster source? I have seen that if one
>> of the sources is really slow, then the elements of the faster source
>> fire, and when the elements arrive from the slower source, the same
>> window fires again with the new elements only. I can work around this
>> by adding delays, but does merging watermarks require that both have
>> arrived by the time the watermarks progress to the point where a
>> window can be triggered? Is applying a delay in the watermark the only
>> way to solve this?
>>
>> Sameer
>>
>> On Jul 20, 2016, at 9:41 PM, Vishnu Viswanath
>> <vishnu.viswanath25@gmail.com> wrote:
>>
>> Hi David,
>>
>> You are right, the events in the window are not sorted according to
>> the EventTime, hence the processing is not done in increasing order of
>> timestamp.
>> As you said, you will have to do the sorting yourself in your window
>> function to make sure that you are processing the events in order.
>>
>> What Flink does (when EventTime is set and timestamps are assigned) is
>> assign the elements to windows based on the EventTime. With
>> ProcessingTime, the same elements might have ended up in a different
>> window.
>>
>> This is as per my limited knowledge; other Flink experts can correct
>> me if this is wrong.
>>
>> Thanks,
>> Vishnu
>>
>> On Wed, Jul 20, 2016 at 9:30 PM, David Desberg
>> <david.desberg@uber.com> wrote:
>>
>>> Hi all,
>>>
>>> In Flink, after setting the time characteristic to event time and
>>> properly assigning timestamps/watermarks, time-based windows will be
>>> created based upon event time. If we need to process events within a
>>> window in event time order, we can sort the windowed values and
>>> process as necessary by applying a WindowFunction. However, as I
>>> understand it, there is no guarantee that time-based windows will be
>>> processed in time order. Is this correct? Or, if we assume a
>>> watermarking system that (for example's sake) does not allow any late
>>> events, is there a way within Flink to guarantee that windows will be
>>> processed (via an applied WindowFunction) in strictly increasing time
>>> order?
>>>
>>> If necessary, I can provide a more concrete explanation of what I
>>> mean/am looking for.
>>>
>>> Thanks!
>>> David
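
The behaviour discussed in this thread can be sketched with a small plain-Python model. This is not Flink's API, and every name in it (WindowOperator, window_start, WINDOW_SIZE) is hypothetical. It assumes that an operator's event-time clock is the minimum of the watermarks of its inputs, that a tumbling window [start, end) fires once that merged watermark reaches end - 1, and that no late elements are allowed. Under those assumptions, windows for a single key fire in order of their end timestamp, and the window function still has to sort the elements itself.

```python
from collections import defaultdict

WINDOW_SIZE = 10  # hypothetical tumbling-window size, in event-time units


def window_start(timestamp, size=WINDOW_SIZE):
    """Start of the tumbling window [start, start + size) containing timestamp."""
    return timestamp - (timestamp % size)


class WindowOperator:
    """Toy single-key window operator fed by several sources (illustration only)."""

    def __init__(self, num_sources):
        # Last watermark seen per source; nothing fires until every source reports one.
        self.source_watermarks = [float("-inf")] * num_sources
        self.windows = defaultdict(list)  # window start -> buffered timestamps
        self.fired = []  # (window_end, sorted elements), in firing order

    def on_element(self, timestamp):
        self.windows[window_start(timestamp)].append(timestamp)

    def on_watermark(self, source, watermark):
        self.source_watermarks[source] = max(self.source_watermarks[source], watermark)
        merged = min(self.source_watermarks)  # the slowest source holds the clock back
        # Fire every complete window, earliest end timestamp first.
        for start in sorted(self.windows):
            end = start + WINDOW_SIZE
            if merged < end - 1:
                break
            # The buffer is in arrival order, so the "window function" sorts it.
            self.fired.append((end, sorted(self.windows.pop(start))))


op = WindowOperator(num_sources=2)
for ts in [12, 3, 7, 15, 1]:  # elements arrive out of order
    op.on_element(ts)

op.on_watermark(0, 25)  # fast source is far ahead; slow source has not reported yet
fired_after_fast_source = list(op.fired)

op.on_watermark(1, 9)   # slow source reaches 9: window [0, 10) can fire
op.on_watermark(1, 19)  # slow source reaches 19: window [10, 20) can fire

print(fired_after_fast_source)  # -> []
print(op.fired)                 # -> [(10, [1, 3, 7]), (20, [12, 15])]
```

In this model the fast source alone never triggers a window, which matches Aljoscha's description. The model does not cover elements that arrive after their window has already fired; that case depends on the allowed-lateness and trigger settings mentioned in the thread.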