flink-dev mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: Periodic full stream aggregations
Date Tue, 21 Apr 2015 10:09:24 GMT
Would it be possible to switch the order of the statements, i.e.,

dataStream.every(Time.of(4, sec)).reduce(...) instead of
dataStream.reduce(...).every(Time.of(4, sec))?

I think that would be more consistent with the structure of the rest of the
API.

Cheers, Fabian

2015-04-21 10:57 GMT+02:00 Gyula Fóra <gyfora@apache.org>:

> Hi Bruno,
>
> Of course you can do that as well. (That's the good part :p )
>
> I will open a PR soon with the proposed changes (first without breaking the
> current API) and I will post it here.
>
> Cheers,
> Gyula
>
> On Tuesday, April 21, 2015, Bruno Cadonna <cadonna@informatik.hu-berlin.de
> >
> wrote:
>
> >
> > Hi Gyula,
> >
> > I have a question regarding your suggestion.
> >
> > Can the current continuous aggregation also be specified with your
> > proposed periodic aggregation?
> >
> > I am thinking about something like
> >
> > dataStream.reduce(...).every(Count.of(1))
> >
> > Cheers,
> > Bruno
> >
> > On 20.04.2015 22:32, Gyula Fóra wrote:
> > > Hey all,
> > >
> > > I think we are missing a quite useful feature that could be
> > > implemented (with some slight modifications) on top of the current
> > > windowing api.
> > >
> > > We currently provide two ways of aggregating (or reducing) over
> > > streams: doing a continuous aggregation that always outputs the
> > > current aggregated value (which cannot be done properly in
> > > parallel), or doing the aggregation periodically in a window.
> > >
> > > What we don't have at the moment is periodic aggregation over the
> > > whole stream. I would even go as far as to remove the continuously
> > > outputting reduce/aggregate and replace it with this version, as
> > > this one can in turn be done properly in parallel.
> > >
> > > My suggestion would be that a call
> > >
> > > dataStream.reduce(...) or dataStream.sum(...)
> > >
> > > would return a windowed data stream where the window is the whole
> > > record history, and the user would need to define a trigger to get
> > > the actual reduced values, like:
> > >
> > > dataStream.reduce(...).every(Time.of(4, sec)) or
> > > dataStream.sum(...).every(...)
> > >
> > > I think the current data stream reduce/aggregation is very
> > > confusing without being practical for any normal use case.
> > >
> > > Also, this would be a very API-breaking change (but I would still
> > > make it, as it is much more intuitive than the current behaviour),
> > > so I would try to push it before the release if we can agree.
> > >
> > > Cheers, Gyula
> > >
> >
> > --
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> >   Dr. Bruno Cadonna
> >   Postdoctoral Researcher
> >
> >   Databases and Information Systems
> >   Department of Computer Science
> >   Humboldt-Universität zu Berlin
> >
> >   http://www.informatik.hu-berlin.de/~cadonnab
> >
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
>
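The semantics discussed in this thread can be sketched outside of Flink as a running reduce over the whole stream whose current value is emitted only when a trigger fires. The names below (reduceEvery, a count-based trigger standing in for Count.of(n); a time-based trigger like Time.of(4, sec) would fire on a clock instead) are hypothetical illustrations, not actual Flink API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;

// Minimal simulation of the proposed reduce(...).every(...) semantics:
// a running reduce over the entire record history, with the current
// aggregate emitted only when the trigger fires (every N elements here).
public class EverySketch {

    static <T> List<T> reduceEvery(List<T> stream, BinaryOperator<T> reducer, int everyN) {
        List<T> emitted = new ArrayList<>();
        T acc = null;
        int seen = 0;
        for (T element : stream) {
            // full-stream aggregation: fold every element into the accumulator
            acc = (acc == null) ? element : reducer.apply(acc, element);
            seen++;
            if (seen % everyN == 0) {
                emitted.add(acc); // trigger fires: emit the current aggregate
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<Integer> stream = List.of(1, 2, 3, 4, 5, 6);
        // every(Count.of(2)): emit the running sum after every 2nd element
        System.out.println(reduceEvery(stream, Integer::sum, 2)); // [3, 10, 21]
        // every(Count.of(1)): Bruno's case, equivalent to the old continuous reduce
        System.out.println(reduceEvery(stream, Integer::sum, 1)); // [1, 3, 6, 10, 15, 21]
    }
}
```

With a count trigger of 1, every element produces an output, which is why the old continuous aggregation falls out as a special case of the proposed API.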
