flink-dev mailing list archives

From Gyula Fóra <gyula.f...@gmail.com>
Subject Re: Thoughts About Streaming
Date Tue, 23 Jun 2015 08:37:33 GMT
Hey

I think we should not block PRs unnecessarily just because your suggested
changes might touch them at some point.

Also, I still think we should not put everything into DataStream, because
it will become a huge mess.

Also, we need to agree on the out-of-order processing, and whether we want
it the way you proposed it (which is quite costly). An alternative approach
that fits into the current windowing is to filter out the out-of-order
events and apply a special handling operator to them. This would be
fairly lightweight.
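
To make the alternative concrete, here is a minimal sketch in plain Python, not Flink code. It assumes that "out of order" means an event whose timestamp is lower than the highest timestamp seen so far. The `Event` type and the `split_out_of_order` function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    """Hypothetical event type: a timestamp plus a payload value."""
    timestamp: int
    value: int


def split_out_of_order(events: Iterable[Event]) -> Tuple[List[Event], List[Event]]:
    """Split a stream into in-order events and out-of-order events.

    Assumption: an event is out of order if its timestamp is lower than
    the highest timestamp seen so far. In-order events would continue to
    the regular windowing; the others would go to a special handler.
    """
    in_order: List[Event] = []
    out_of_order: List[Event] = []
    max_seen = None  # highest timestamp observed among in-order events
    for event in events:
        if max_seen is not None and event.timestamp < max_seen:
            out_of_order.append(event)
        else:
            in_order.append(event)
            max_seen = event.timestamp
    return in_order, out_of_order
```

The check is one comparison per element, which is why this variant would be cheap. What the special handling operator does with the out-of-order events is left open here.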

My point is that we need to consider some alternative solutions. And we
should not block contributions along the way.

Cheers
Gyula

On Tue, Jun 23, 2015 at 9:55 AM Aljoscha Krettek <aljoscha@apache.org>
wrote:

> The reason I posted this now is that we need to think about the API and
> windowing before proceeding with the PRs of Gabor (inverse reduce) and
> Gyula (removal of "aggregate" functions on DataStream).
>
> For the windowing, I think that the current model does not work for
> out-of-order processing. Therefore, the whole windowing infrastructure will
> basically have to be redone. Meaning also that any work on the
> pre-aggregators or optimizations that we do now becomes useless.
>
> For the API, I proposed to restructure the interactions between all the
> different *DataStream classes and grouping/windowing. (See API section of
> the doc I posted.)
>
> On Mon, 22 Jun 2015 at 21:56 Gyula Fóra <gyula.fora@gmail.com> wrote:
>
> > Hi Aljoscha,
> >
> > Thanks for the nice summary, this is a very good initiative.
> >
> > I added some comments to the respective sections (where I didn't fully
> > agree :)).
> > At some point I think it would be good to have a public hangout session
> > on this, which could make for a more dynamic discussion.
> >
> > Cheers,
> > Gyula
> >
> > On Mon, 22 Jun 2015 at 21:34 Aljoscha Krettek <aljoscha@apache.org> wrote:
> >
> > > Hi,
> > > with people proposing changes to the streaming part I also wanted to
> > > throw my hat into the ring. :D
> > >
> > > During the last few months, while I was getting acquainted with the
> > > streaming system, I wrote down some thoughts I had about how things
> > > could be improved. Hopefully, they are in somewhat coherent shape now,
> > > so please have a look if you are interested in this:
> > >
> > > https://docs.google.com/document/d/1rSoHyhUhm2IE30o5tkR8GEetjFvMRMNxvsCfoPsW6_4/edit?usp=sharing
> > >
> > > This mostly covers:
> > >  - Timestamps assigned at sources
> > >  - Out-of-order processing of elements in window operators
> > >  - API design
> > >
> > > Please let me know what you think. Comment in the document or here on
> > > the mailing list.
> > >
> > > I have a PR in the making that would introduce source timestamps and
> > > watermarks for keeping track of them. I also hacked together a
> > > proof-of-concept of a windowing system that is able to process
> > > out-of-order elements using a FlatMap operator. (It uses panes to
> > > perform efficient pre-aggregations.)
> > >
> > > Cheers,
> > > Aljoscha
> > >
> >
>
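
For readers unfamiliar with the terms in the quoted proposal, here is a rough sketch of how watermarks and panes can fit together. This is plain Python and not the proof-of-concept itself. It assumes non-negative integer timestamps, sliding windows of the form [k*slide, k*slide + size), a sum as the aggregate, and that a watermark w means no element with a timestamp below w will arrive anymore. The class `PaneWindowSum` and its methods are hypothetical names.

```python
from math import gcd
from typing import Dict, List, Tuple


class PaneWindowSum:
    """Sliding-window sum that pre-aggregates elements into panes."""

    def __init__(self, size: int, slide: int) -> None:
        self.size = size
        self.slide = slide
        # Every window is an exact union of panes of this length.
        self.pane = gcd(size, slide)
        self.panes: Dict[int, int] = {}  # pane start -> partial sum
        self.next_start = 0  # start of the next window to emit

    def add(self, timestamp: int, value: int) -> bool:
        """Add an element in any order; return False if it is too late."""
        if timestamp < self.next_start:
            return False  # every window containing it was already emitted
        pane_start = timestamp - timestamp % self.pane
        self.panes[pane_start] = self.panes.get(pane_start, 0) + value
        return True

    def on_watermark(self, watermark: int) -> List[Tuple[int, int, int]]:
        """Emit (start, end, sum) for each window ending at or before the watermark."""
        results = []
        while self.next_start + self.size <= watermark:
            start, end = self.next_start, self.next_start + self.size
            # Combine the pre-aggregated panes instead of single elements.
            total = sum(self.panes.get(p, 0) for p in range(start, end, self.pane))
            results.append((start, end, total))
            self.next_start += self.slide
            # Panes before the next window start are no longer needed.
            for p in [p for p in self.panes if p < self.next_start]:
                del self.panes[p]
        return results
```

Elements may arrive in any order between watermarks, since each one only updates the partial sum of its pane. An overlapping window then reuses the pane sums rather than re-reading the individual elements, which is where the saving comes from.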
