Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id BF06C200B62 for ; Fri, 12 Aug 2016 11:15:49 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id BD847160AB6; Fri, 12 Aug 2016 09:15:49 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id DCDDD160AB0 for ; Fri, 12 Aug 2016 11:15:48 +0200 (CEST) Received: (qmail 38882 invoked by uid 500); 12 Aug 2016 09:11:52 -0000 Mailing-List: contact user-help@flink.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: user@flink.apache.org Delivered-To: mailing list user@flink.apache.org Received: (qmail 38831 invoked by uid 99); 12 Aug 2016 09:11:52 -0000 Received: from mail-relay.apache.org (HELO mail-relay.apache.org) (140.211.11.15) by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 12 Aug 2016 09:11:52 +0000 Received: from mail-io0-f178.google.com (mail-io0-f178.google.com [209.85.223.178]) by mail-relay.apache.org (ASF Mail Server at mail-relay.apache.org) with ESMTPSA id 71A7A1A00C5 for ; Fri, 12 Aug 2016 09:11:52 +0000 (UTC) Received: by mail-io0-f178.google.com with SMTP id q83so19522303iod.1 for ; Fri, 12 Aug 2016 02:11:52 -0700 (PDT) X-Gm-Message-State: AEkooutZinO2fsSBigT9EksR/YgVaO9q8tinwH4mtKJbZ/ksJJMZSBdnNCU7uep1a/URbFlDLWjNxvpJGgIDHQ== X-Received: by 10.107.184.70 with SMTP id i67mr16431206iof.26.1470993111700; Fri, 12 Aug 2016 02:11:51 -0700 (PDT) MIME-Version: 1.0 References: <73AD744D-5ABF-4810-B1EB-2EC3370CCED4@axiomine.com> In-Reply-To: From: Aljoscha Krettek Date: Fri, 12 Aug 2016 09:11:41 +0000 X-Gmail-Original-Message-ID: Message-ID: Subject: Re: Does Flink DataStreams using combiners? To: "user@flink.apache.org" Content-Type: multipart/alternative; boundary=94eb2c06d362e3dae50539dc4533 archived-at: Fri, 12 Aug 2016 09:15:49 -0000 --94eb2c06d362e3dae50539dc4533 Content-Type: text/plain; charset=UTF-8 Hi, Sameer is right that Flink currently does not combine for any combination of assigner, trigger and window function. Technically, it would be possible to use a combiner for Triggers that don't observe individual elements but only fire on time. With triggers that observe elements, such as CountTrigger it becomes impossible to figure out when to fire. Cheers, Aljoscha On Fri, 12 Aug 2016 at 03:36 Sameer W wrote: > Sorry I mean streaming cannot use combiners (repeated below) > ------- > Streaming cannot use combiners. The aggregations happen on the trigger. > > The elements being aggregated are only known after the trigger delivers > the elements to the evaluation function. > > Since windows can overlap and even assignment to a window is not done > until the elements arrive at the sum operator in your case, combiner cannot > know what to pre aggregate even if were available. > > On Thu, Aug 11, 2016 at 9:22 PM, Sameer Wadkar > wrote: > >> Streaming cannot use windows. The aggregations happen on the trigger. >> >> The elements being aggregated are only known after the trigger delivers >> the elements to the evaluation function. >> >> Since windows can overlap and even assignment to a window is not done >> until the elements arrive at the sum operator in your case, combiner cannot >> know what to pre aggregate even if were available. > > >> >> >> >> > On Aug 11, 2016, at 8:51 PM, Elias Levy >> wrote: >> > >> > I am wondering if Flink makes use of combiners to pre-reduce a keyed >> and windowed stream before shuffling the data among workers. >> > >> > I.e. will it use a combiner in something like: >> > >> > stream.flatMap {...} >> > .assignTimestampsAndWatermarks(...) >> > .keyBy(...) >> > .timeWindow(...) >> > .trigger(...) >> > .sum("cnt") >> > >> > or will it shuffle the keyed input before the sum reduction? >> > >> > If it does make use of combiners, it would be useful to point this out >> in the documentation, particularly if it only applies to certain types of >> reducers, folds, etc. >> > --94eb2c06d362e3dae50539dc4533 Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
Hi,
Sameer is right that Flink currently does not comb= ine for any combination of assigner, trigger and window function.

Technically, it would be possible to use a combiner for Tri= ggers that don't observe individual elements but only fire on time. Wit= h triggers that observe elements, such as CountTrigger it becomes impossibl= e to figure out when to fire.

Cheers,
Al= joscha

On Fri, 1= 2 Aug 2016 at 03:36 Sameer W <sam= eer@axiomine.com> wrote:
Sorry I mean streaming cannot use combiners (repeated below)<= div>-------
Streaming cannot use= combiners. The aggregations happen on the trigger.

The elements being aggregated are only known after the trigger delivers = the elements to the evaluation function.

Since wi= ndows can overlap and even assignment to a window is not done until the ele= ments arrive at the sum operator in your case, combiner cannot know what to= pre aggregate even if were available.

On Thu, Aug 11, 2016 at 9:22 PM= , Sameer Wadkar <sameer@axiomine.com> wrote:
Streaming cannot use windows. The aggregations happen = on the trigger.

The elements being aggregated are only known after the trigger delivers the= elements to the evaluation function.

Since windows can overlap and even assignment to a window is not done until= the elements arrive at the sum operator in your case, combiner cannot know= what to pre aggregate even if were available.




> On Aug 11, 2016, at 8:51 PM, Elias Levy <fearsome.lucidity@gmail.com> = wrote:
>
> I am wondering if Flink makes use of combiners to pre-reduce a keyed a= nd windowed stream before shuffling the data among workers.
>
> I.e. will it use a combiner in something like:
>
> stream.flatMap {...}
>=C2=A0 =C2=A0 =C2=A0 =C2=A0.assignTimestampsAndWatermarks(...)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0.keyBy(...)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0.timeWindow(...)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0.trigger(...)
>=C2=A0 =C2=A0 =C2=A0 =C2=A0.sum("cnt")
>
> or will it shuffle the keyed input before the sum reduction?
>
> If it does make use of combiners, it would be useful to point this out= in the documentation, particularly if it only applies to certain types of = reducers, folds, etc.
--94eb2c06d362e3dae50539dc4533--