flink-dev mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Design documents for consolidated DataStream API
Date Mon, 13 Jul 2015 15:21:11 GMT
Okay, what is missing about the windowing in your opinion?

The core points of the document are:

  - The parallel windows are per group only.

  - The implementation of the parallel windows holds window data in the
group buffers.

  - The global windows are non-parallel. They may have parallel
pre-aggregation if they are time windows.

  - Time may be operator time (timer thread), or watermark time. Watermark
time can refer to ingress or event time.

  - Windows that do not pre-aggregate may require elements in order. These
are not part of the first prototype.

Do we agree on those points?
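
To make the first two points and the watermark point concrete, here is a minimal sketch in plain Python. It is not Flink API and not the design from the documents: the class `KeyedTumblingWindows`, its method names, and the rule "a window fires once the watermark reaches its end" are all assumptions made for illustration. It keeps one pre-aggregated sum per (group, window) in a per-group buffer and emits results only when a watermark passes the window end.

```python
from collections import defaultdict


class KeyedTumblingWindows:
    """Hypothetical per-group tumbling time windows with pre-aggregated sums."""

    def __init__(self, size_ms):
        self.size_ms = size_ms
        # group key -> {window start -> running sum}; stands in for "group buffers"
        self.buffers = defaultdict(dict)

    def add(self, key, timestamp_ms, value):
        # Assign the element to the window that contains its timestamp.
        start = timestamp_ms - (timestamp_ms % self.size_ms)
        windows = self.buffers[key]
        # Pre-aggregate: only the running sum is stored, not the elements.
        windows[start] = windows.get(start, 0) + value

    def on_watermark(self, watermark_ms):
        # Assumed firing rule: emit every window whose end <= watermark.
        fired = []
        for key, windows in self.buffers.items():
            for start in sorted(windows):  # sorted() copies, so pop() is safe
                if start + self.size_ms <= watermark_ms:
                    fired.append((key, start, windows.pop(start)))
        return sorted(fired)
```

Because the buffer only holds a running sum, the elements of a window can arrive in any order before the watermark, which is why only windows that do not pre-aggregate would need ordered input. Every group's buffer is independent of the others, so the groups can be spread across parallel instances.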


On Mon, Jul 13, 2015 at 4:50 PM, Gyula Fóra <gyula.fora@gmail.com> wrote:

> In general I like it, although the main difference between the current and
> the new one is the windowing and that is still not very clear.
>
> Where do we have the full-stream time windows, for instance (the ones that
> are parallel but not keyed)?
> On Mon, Jul 13, 2015 at 4:28 PM Aljoscha Krettek <aljoscha@apache.org>
> wrote:
>
> > +1 I like it as well.
> >
> > On Mon, 13 Jul 2015 at 16:17 Kostas Tzoumas <ktzoumas@apache.org> wrote:
> >
> > > +1 from my side
> > >
> > > On Mon, Jul 13, 2015 at 4:15 PM, Stephan Ewen <sewen@apache.org> wrote:
> > >
> > > > Do we have consensus on these designs?
> > > >
> > > > If we have, we should get to implementing this soon, because basically
> > > > all streaming patches will have to be revisited in light of this...
> > > >
> > > > On Tue, Jul 7, 2015 at 3:41 PM, Gyula Fóra <gyula.fora@gmail.com> wrote:
> > > >
> > > > > You are right, that's an important issue.
> > > > >
> > > > > And I think we should also do some renaming with the "iterations",
> > > > > because they are not really iterations like in the batch case and it
> > > > > might confuse some users.
> > > > > Maybe we can call them loops or cycles and rename the API calls to
> > > > > make it more intuitive what happens. It is really just a cyclic
> > > > > dataflow.
> > > > >
> > > > > Aljoscha Krettek <aljoscha@apache.org> wrote (on Tue, Jul 7, 2015, 15:35):
> > > > >
> > > > > > Hi,
> > > > > > I just noticed that we don't have anything about how iterations
> > > > > > and timestamps/watermarks should interact.
> > > > > >
> > > > > > Cheers,
> > > > > > Aljoscha
> > > > > >
> > > > > > On Mon, 6 Jul 2015 at 23:56 Stephan Ewen <sewen@apache.org> wrote:
> > > > > >
> > > > > > > Hi all!
> > > > > > >
> > > > > > > As many of you know, there are ongoing efforts to consolidate the
> > > > > > > streaming API for the next release, and then graduate it (from beta
> > > > > > > status).
> > > > > > >
> > > > > > > In the process of this consolidation, we want to achieve the
> > > > > > > following goals.
> > > > > > >
> > > > > > >  - Make the code more robust and simplify it in parts
> > > > > > >
> > > > > > >  - Clearly define the semantics of the constructs.
> > > > > > >
> > > > > > >  - Prepare it for support of more advanced concepts, like
> > > > > > > partitionable state, and event time.
> > > > > > >
> > > > > > >  - Cut support for certain corner cases that were prototyped, but
> > > > > > > turned out to be not efficiently doable
> > > > > > >
> > > > > > >
> > > > > > > Based on prior discussions on the mailing list, Aljoscha and I
> > > > > > > drafted the design documents below, which outline how the
> > > > > > > consolidated API would look.
> > > > > > > We focused on constructs, time, and window semantics.
> > > > > > >
> > > > > > >
> > > > > > > Design document on how to restructure the Streaming API:
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/Streams+and+Operations+on+Streams
> > > > > > >
> > > > > > > Design document on definitions of time, order, and the resulting
> > > > > > > semantics:
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/Time+and+Order+in+Streams
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Note: The design of the interfaces and concepts for advanced state
> > > > > > > in functions is not in here. That is part of a separate design
> > > > > > > discussion and orthogonal to the designs drafted here.
> > > > > > >
> > > > > > >
> > > > > > > Please have a look and voice questions and concerns. Since we
> > > > > > > should not break the streaming API more than once, we should make
> > > > > > > sure this consolidation brings it into the shape we want it to be in.
> > > > > > >
> > > > > > >
> > > > > > > Greetings,
> > > > > > > Stephan
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
