flink-dev mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Thoughts About Streaming
Date Tue, 23 Jun 2015 08:03:36 GMT
For the windowing designs, we should also keep in mind what requirements we
have for how the elements are kept/stored (in external stores, in Flink
managed memory, ...).

On Tue, Jun 23, 2015 at 9:55 AM, Aljoscha Krettek <aljoscha@apache.org>
wrote:

> The reason I posted this now is that we need to think about the API and
> windowing before proceeding with the PRs of Gabor (inverse reduce) and
> Gyula (removal of "aggregate" functions on DataStream).
>
> For the windowing, I think that the current model does not work for
> out-of-order processing. Therefore, the whole windowing infrastructure will
> basically have to be redone. This also means that any work we do now on
> pre-aggregators or optimizations becomes useless.
>
> For the API, I proposed to restructure the interactions between all the
> different *DataStream classes and grouping/windowing. (See API section of
> the doc I posted.)
>
> On Mon, 22 Jun 2015 at 21:56 Gyula Fóra <gyula.fora@gmail.com> wrote:
>
> > Hi Aljoscha,
> >
> > Thanks for the nice summary, this is a very good initiative.
> >
> > I added some comments to the respective sections (where I didn't fully
> > agree :) ).
> > At some point I think it would be good to have a public hangout session
> > on this, which could make for a more dynamic discussion.
> >
> > Cheers,
> > Gyula
> >
> > On Mon, 22 Jun 2015 at 21:34 Aljoscha Krettek <aljoscha@apache.org> wrote:
> >
> > > Hi,
> > > with people proposing changes to the streaming part I also wanted to
> > > throw my hat into the ring. :D
> > >
> > > During the last few months, while I was getting acquainted with the
> > > streaming system, I wrote down some thoughts I had about how things
> > > could be improved. Hopefully, they are in somewhat coherent shape now,
> > > so please have a look if you are interested in this:
> > >
> > > https://docs.google.com/document/d/1rSoHyhUhm2IE30o5tkR8GEetjFvMRMNxvsCfoPsW6_4/edit?usp=sharing
> > >
> > > This mostly covers:
> > >  - Timestamps assigned at sources
> > >  - Out-of-order processing of elements in window operators
> > >  - API design
> > >
> > > Please let me know what you think. Comment in the document or here on
> > > the mailing list.
> > >
> > > I have a PR in the making that would introduce source timestamps and
> > > watermarks for keeping track of them. I also hacked a proof-of-concept
> > > of a windowing system that is able to process out-of-order elements
> > > using a FlatMap operator. (It uses panes to perform efficient
> > > pre-aggregations.)
> > >
> > > Cheers,
> > > Aljoscha
> > >
> >
>
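
To illustrate the pane-based, out-of-order windowing mentioned in the quoted
proposal, here is a minimal Python sketch. It is not the proof-of-concept from
the thread and uses no Flink API. The class `PaneWindowSum`, its method names,
the choice of a sum as the aggregate, and the convention that a watermark `w`
promises no further elements with a timestamp below `w` are all assumptions
made for this example.

```python
from collections import defaultdict


class PaneWindowSum:
    """Sliding event-time window sum, pre-aggregated into panes.

    Hypothetical illustration only. Each pane covers one slide interval,
    so a window is the combination of window_size / slide adjacent panes.
    """

    def __init__(self, window_size, slide):
        if window_size % slide != 0:
            raise ValueError("window_size must be a multiple of slide")
        self.window_size = window_size
        self.pane_size = slide
        self.panes = defaultdict(int)  # pane start -> partial sum
        self.emitted_up_to = None      # end of the last emitted window

    def add(self, timestamp, value):
        """Fold an element into its pane; arrival order does not matter.

        Returns False if the element's pane was already evicted (too late).
        An accepted straggler only contributes to windows not yet emitted.
        """
        pane_start = timestamp - timestamp % self.pane_size
        if self.emitted_up_to is not None:
            oldest_kept = self.emitted_up_to - self.window_size + self.pane_size
            if pane_start < oldest_kept:
                return False
        self.panes[pane_start] += value
        return True

    def on_watermark(self, watermark):
        """Emit (start, end, sum) for every window with end <= watermark."""
        results = []
        if not self.panes:
            return results
        end = min(self.panes) + self.pane_size
        if self.emitted_up_to is not None:
            end = max(end, self.emitted_up_to + self.pane_size)
        while end <= watermark and self.panes:
            start = end - self.window_size
            total = sum(self.panes.get(s, 0)
                        for s in range(start, end, self.pane_size))
            results.append((start, end, total))
            # The oldest pane of this window is not needed by later windows.
            self.panes.pop(start, None)
            self.emitted_up_to = end
            end += self.pane_size
        return results


if __name__ == "__main__":
    w = PaneWindowSum(window_size=10, slide=5)
    for ts, v in [(1, 1), (7, 2), (3, 4), (12, 8)]:  # (3, 4) is out of order
        w.add(ts, v)
    print(w.on_watermark(10))
```

Each element is folded into exactly one pane on arrival, so overlapping
windows share the partial sums instead of re-aggregating every element. The
watermark, not the arrival order, decides when a window is complete.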
