Return-Path: X-Original-To: apmail-flink-dev-archive@www.apache.org Delivered-To: apmail-flink-dev-archive@www.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 14C4D17E32 for ; Tue, 21 Apr 2015 10:09:52 +0000 (UTC) Received: (qmail 81056 invoked by uid 500); 21 Apr 2015 10:09:52 -0000 Delivered-To: apmail-flink-dev-archive@flink.apache.org Received: (qmail 81008 invoked by uid 500); 21 Apr 2015 10:09:51 -0000 Mailing-List: contact dev-help@flink.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@flink.apache.org Delivered-To: mailing list dev@flink.apache.org Received: (qmail 80995 invoked by uid 99); 21 Apr 2015 10:09:51 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 21 Apr 2015 10:09:51 +0000 X-ASF-Spam-Status: No, hits=2.2 required=5.0 tests=HTML_MESSAGE,SPF_PASS X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: message received from 54.164.171.186 which is an MX secondary for dev@flink.apache.org) Received: from [54.164.171.186] (HELO mx1-us-east.apache.org) (54.164.171.186) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 21 Apr 2015 10:09:46 +0000 Received: from mail-la0-f49.google.com (mail-la0-f49.google.com [209.85.215.49]) by mx1-us-east.apache.org (ASF Mail Server at mx1-us-east.apache.org) with ESMTPS id C80A143E6E for ; Tue, 21 Apr 2015 10:09:25 +0000 (UTC) Received: by labbd9 with SMTP id bd9so147412887lab.2 for ; Tue, 21 Apr 2015 03:09:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; bh=E5e9xzyoTGbJpmtCq3fboywSZnEbwPlFd2mPdmDUMHY=; b=Ejps1/MrRWnh9IWyZdb4qbcBowbfTME4/ekR87njlH+u/d1rituGqX3fMO+B1btWji BaAnin2IHC1TwmvbYL4g2EnjwRT91TvuCvxVA7Mf5XyNitJGVd684lYh+MyX8KuWNZAq GMSweHOoC+L7zIJY81dSK5YkadWqvpWbiCba4HlR0TnVvyVir9hrluCtHPhaoPIaVGbc uIoyMPXWHgaaf47AIFtJwfWMJmD9/5E9SIFBaOOJVw3XSvWC0r8kGfXOvh8Rro9tfhgo U9EbB+3NNNsXTvjPYWd4nvuU/eyZqw7ZsXX4r8UdsAqc7ILxdJLLVZ6oE4auZSsyCzxO 4FCw== MIME-Version: 1.0 X-Received: by 10.152.43.43 with SMTP id t11mr18715235lal.74.1429610964707; Tue, 21 Apr 2015 03:09:24 -0700 (PDT) Received: by 10.152.225.171 with HTTP; Tue, 21 Apr 2015 03:09:24 -0700 (PDT) In-Reply-To: References: <5536069E.4060207@informatik.hu-berlin.de> Date: Tue, 21 Apr 2015 12:09:24 +0200 Message-ID: Subject: Re: Periodic full stream aggregations From: Fabian Hueske To: "dev@flink.apache.org" Content-Type: multipart/alternative; boundary=001a11c36594b7ec280514393d0f X-Virus-Checked: Checked by ClamAV on apache.org --001a11c36594b7ec280514393d0f Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Is it possible to switch the order of the statements, i.e., dataStream.every(Time.of(4,sec)).reduce(...) instead of dataStream.reduce(...).every(Time.of(4,sec)) I think that would be more consistent with the structure of the remaining API. Cheers, Fabian 2015-04-21 10:57 GMT+02:00 Gyula F=C3=B3ra : > Hi Bruno, > > Of course you can do that as well. (That's the good part :p ) > > I will open a PR soon with the proposed changes (first without breaking t= he > current Api) and I will post it here. > > Cheers, > Gyula > > On Tuesday, April 21, 2015, Bruno Cadonna > > wrote: > > > -----BEGIN PGP SIGNED MESSAGE----- > > Hash: SHA1 > > > > Hi Gyula, > > > > I have a question regarding your suggestion. > > > > Can the current continuous aggregation be also specified with your > > proposed periodic aggregation? > > > > I am thinking about something like > > > > dataStream.reduce(...).every(Count.of(1)) > > > > Cheers, > > Bruno > > > > On 20.04.2015 22:32, Gyula F=C3=B3ra wrote: > > > Hey all, > > > > > > I think we are missing a quite useful feature that could be > > > implemented (with some slight modifications) on top of the current > > > windowing api. > > > > > > We currently provide 2 ways of aggregating (or reducing) over > > > streams: doing a continuous aggregation and always output the > > > aggregated value (which cannot be done properly in parallel) or > > > doing aggregation in a window periodically. > > > > > > What we don't have at the moment is periodic aggregations on the > > > whole stream. I would even go as far as to remove the continuous > > > outputting reduce/aggregate it and replace it with this version as > > > this in return can be done properly in parallel. > > > > > > My suggestion would be that a call: > > > > > > dataStream.reduce(..) dataStream.sum(..) > > > > > > would return a windowed data stream where the window is the whole > > > record history, and the user would need to define a trigger to get > > > the actual reduced values like: > > > > > > dataStream.reduce(...).every(Time.of(4,sec)) to get the actual > > > reduced results. dataStream.sum(...).every(...) > > > > > > I think the current data stream reduce/aggregation is very > > > confusing without being practical for any normal use-case. > > > > > > Also this would be a very api breaking change (but I would still > > > make this change as it is much more intuitive than the current > > > behaviour) so I would try to push it before the release if we can > > > agree. > > > > > > Cheers, Gyula > > > > > > > - -- > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > > Dr. Bruno Cadonna > > Postdoctoral Researcher > > > > Databases and Information Systems > > Department of Computer Science > > Humboldt-Universit=C3=A4t zu Berlin > > > > http://www.informatik.hu-berlin.de/~cadonnab > > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > -----BEGIN PGP SIGNATURE----- > > Version: GnuPG v1.4.11 (GNU/Linux) > > > > iQEcBAEBAgAGBQJVNgaeAAoJEKdCIJx7flKwfiMH/AzPpKtse9eMOzFsXSuBslNr > > PZRQ0vpI7vw9eYFIuqp33SltN0zmLmDt3VzgJz0EZK5zSRCF9NOeke1emQwlrPsB > > g65a4XccWT2qPotodF39jTTdE5epeUf8NdE552sr+Ya5LMtt8TmozD0lEOVfNt7n > > R6KQdDU70U0zoCPwv0S13cak8a8k7phGvShXeW4nSZKp8C+WJa3IbUZkHlIlkC1L > > OnyYy4b14bnfjiknKt2mKcjLG7eQEq0X6aN85Zf+5X8BUg3auk9N9Cva2XMRuD1p > > gOoC+2gPZcr2IB9Sgs+s5pxfhaoVpbQ9Z7gRh8BkWqftveA7RD6KymmBxoUtujA=3D > > =3D8bVQ > > -----END PGP SIGNATURE----- > > > --001a11c36594b7ec280514393d0f--