flink-user mailing list archives

From Juan Rodríguez Hortalá <juan.rodriguez.hort...@gmail.com>
Subject Re: Early events
Date Sun, 20 Nov 2016 05:18:13 GMT
Hi,

There was a bug in my code: I was assigning the timestamps incorrectly, and
that is why it looked like early events were assigned processing time.
Surprisingly enough, my test works fine with early events. In fact, I
have modified my test data generator to generate either early events or late
events, and both seem to work with my test (
https://github.com/juanrh/flink-state-eviction/blob/293fe1cf972b2e4bc6fb4e874eb8ba70c78f7894/src/main/java/com/github/juanrh/streaming/source/EventTimeDelayedElementsSource.java,
https://github.com/juanrh/flink-state-eviction/blob/293fe1cf972b2e4bc6fb4e874eb8ba70c78f7894/src/test/java/com/github/juanrh/streaming/source/EventTimeDelayedElementsSourceTest.java
)

Anyway, is this the expected behaviour for early events? Is Flink buffering
early events until their future timestamp arrives?

Thanks,

Juan


On Sat, Nov 19, 2016 at 8:31 PM, Juan Rodríguez Hortalá <
juan.rodriguez.hortala@gmail.com> wrote:

> Hi,
>
> Maybe this is already in the documentation, so sorry if I'm asking something
> obvious. I was thinking that if you have event time then you can also have
> early events, which would be events whose extracted timestamp is in the
> future. This might happen in practice, for example, in sensors with a skewed
> clock that assign future timestamps to their events. I have made a
> simple test with a time window
> (https://github.com/juanrh/flink-state-eviction/commit/09c2c1fe1e6068b0703c0833b8a574313cdca5a2),
> and it looks like Flink treats early events like events generated at the
> current processing time. What is the expected behaviour of Flink for
> early events?
>
> Early events might be interesting for generating test data, if Flink were
> able to buffer those early events until their actual time arrives, although I
> guess implementing that would probably impact performance in
> production. But as I say, early events might happen in production because
> you can have wrong clocks or wrong code in general in the devices that
> generate the events. Maybe a fallback to ingestion time would make sense,
> and an approximation to that might be implemented with a timestamp
> extractor that overrides future timestamps with System.currentTimeMillis().
>
> Greetings,
>
> Juan
>
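
The fallback described in the quoted message (overriding future timestamps
with the current time) can be sketched roughly as below. This is only a
minimal illustration under stated assumptions: the class and method names are
hypothetical and not part of Flink, and the wiring into a Flink timestamp
extractor is left out. The clamping logic takes the current time as a
parameter so it can be checked without depending on the wall clock.

```java
// Hypothetical helper, not part of Flink: caps event timestamps that lie
// in the future so that they fall back to (approximately) ingestion time.
final class FutureTimestampClamp {

    private FutureTimestampClamp() {
        // static helpers only
    }

    // Returns the extracted timestamp unchanged when it is not in the future,
    // otherwise returns nowMillis. Both values are epoch milliseconds.
    static long clamp(long extractedTimestampMillis, long nowMillis) {
        return Math.min(extractedTimestampMillis, nowMillis);
    }

    // Convenience variant that uses the local wall clock as "now".
    static long clampToWallClock(long extractedTimestampMillis) {
        return clamp(extractedTimestampMillis, System.currentTimeMillis());
    }
}
```

A timestamp extractor could presumably call clampToWallClock on the value it
reads from each element before returning it. Note that this only approximates
ingestion time, since the clock used is that of the operator running the
extractor, not that of the source.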
