flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Event time in Flink streaming
Date Fri, 28 Aug 2015 15:06:34 GMT
Hi Martin,
the answer depends, because the current windowing implementation has some
problems. We are working on improving it in the 0.10 release, though.

If your elements arrive with strictly increasing timestamps and you have
parallelism=1 or don't perform any re-partitioning of data (which a
groupBy() does, for example) then what Matthias proposed works for you. If
not then you can get intro problems with out-of-order elements and windows
will be incorrectly determined.

If you are interested in what we are working on for 0.10, please look at
the design documents here
The basic idea is to make windows work correctly when elements arrive not
ordered by timestamps. For this we want use watermarks as popularized, for
example, by Google Dataflow.

Please ask if you have questions about this or are interested in joining
the discussion (the design as not yet finalized, both API and
implementation). :D


P.S. I have some proof-of-concept work in a branch of mine, if you
interested in my work there I could give you access to it.

On Fri, 28 Aug 2015 at 16:11 Matthias J. Sax <mjsax@informatik.hu-berlin.de>

> Hi Martin,
> you need to implement you own policy. However, this should be be
> complicated. Have a look at "TimeTriggerPolicy". You just need to
> provide a "Timestamp" implementation that extracts you ts-attribute from
> the tuples.
> -Matthias
> On 08/28/2015 03:58 PM, Martin Neumann wrote:
> > Hej,
> >
> > I have a stream of timestamped events I want to process in Flink
> streaming.
> > Di I have to write my own policies to do so, or can define time based
> > windows to use the timestamps instead of the system time?
> >
> > cheers Martin

View raw message