flink-user mailing list archives

From Juan Rodríguez Hortalá <juan.rodriguez.hort...@gmail.com>
Subject Re: Early events
Date Tue, 22 Nov 2016 05:03:17 GMT
That makes sense, thanks for your answer.



On Mon, Nov 21, 2016 at 9:11 AM, Aljoscha Krettek <aljoscha@apache.org> wrote:

> Hi,
> yes, Flink is expected to buffer those until the watermark catches up with
> their timestamp.
> Cheers,
> Aljoscha
> On Sun, 20 Nov 2016 at 06:18 Juan Rodríguez Hortalá <
> juan.rodriguez.hortala@gmail.com> wrote:
>> Hi,
>> There was a bug in my code: I was assigning the timestamps wrongly, and that
>> is why it looked like early events were assigned processing time.
>> Surprisingly enough, my test works OK with early events. In fact, I
>> have modified my test data generator to generate either early events or
>> late events, and both seem to work OK with my test:
>> https://github.com/juanrh/flink-state-eviction/blob/293fe1cf972b2e4bc6fb4e874eb8ba70c78f7894/src/main/java/com/github/juanrh/streaming/source/EventTimeDelayedElementsSource.java
>> https://github.com/juanrh/flink-state-eviction/blob/293fe1cf972b2e4bc6fb4e874eb8ba70c78f7894/src/test/java/com/github/juanrh/streaming/source/EventTimeDelayedElementsSourceTest.java
>> Anyway, is this the expected behaviour for early events? Is Flink
>> buffering early events until their future timestamp arrives?
>> Thanks,
>> Juan
>> On Sat, Nov 19, 2016 at 8:31 PM, Juan Rodríguez Hortalá <
>> juan.rodriguez.hortala@gmail.com> wrote:
>> Hi,
>> Maybe this is already in the documentation, sorry if I'm asking something
>> obvious. I was thinking that if you have event time then you can also have
>> early events, which would be events whose extracted timestamp is in the
>> future. This might happen in practice, for example, with sensors that have
>> a skewed clock and assign timestamps in the future to their events. I have
>> made a simple test with a time window
>> (https://github.com/juanrh/flink-state-eviction/commit/09c2c1fe1e6068b0703c0833b8a574313cdca5a2),
>> and it looks like Flink treats early events like events generated at the
>> current processing time. What is the expected behaviour of Flink for
>> early events?
>> Early events might be interesting for generating test data, if Flink were
>> able to buffer those early events until their actual time arrives, although
>> I guess implementing that would probably impact performance in
>> production. But as I say, early events might happen in production because
>> you can have wrong clocks, or wrong code in general, in the devices that
>> generate the events. Maybe a fallback to ingestion time would make sense,
>> and an approximation to that might be implemented with a timestamp
>> extractor that overrides future timestamps with System.currentTimeMillis().
>> Greetings,
>> Juan
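
The fallback suggested at the end of the original message (overriding future timestamps with System.currentTimeMillis()) could look roughly like the sketch below. This is plain Java, not Flink API: the class name FutureTimestampClamp, the maxSkewMillis tolerance, and the injectable clock are all assumptions made for illustration. In a Flink job the same comparison would presumably sit inside the extractTimestamp method of whatever timestamp assigner the job already uses.

```java
import java.util.function.LongSupplier;

/**
 * Hypothetical helper (not part of Flink): replaces event timestamps that lie
 * too far in the future with the current wall-clock time, approximating a
 * fallback to ingestion time for "early" events.
 */
final class FutureTimestampClamp {
    private final LongSupplier clock;   // source of "now" in epoch millis; injectable for tests
    private final long maxSkewMillis;   // assumed tolerance for slightly fast device clocks

    FutureTimestampClamp(LongSupplier clock, long maxSkewMillis) {
        if (maxSkewMillis < 0) {
            throw new IllegalArgumentException("maxSkewMillis must be >= 0");
        }
        this.clock = clock;
        this.maxSkewMillis = maxSkewMillis;
    }

    /** Returns the event timestamp unchanged unless it is more than maxSkewMillis ahead of now. */
    long clamp(long eventTimestampMillis) {
        long now = clock.getAsLong();
        // Subtract instead of computing now + maxSkewMillis, which could overflow
        // for a very large tolerance.
        boolean tooEarly = eventTimestampMillis > now
                && eventTimestampMillis - now > maxSkewMillis;
        return tooEarly ? now : eventTimestampMillis;
    }

    public static void main(String[] args) {
        // Production-style wiring: real clock, no tolerance for future timestamps.
        FutureTimestampClamp clamp = new FutureTimestampClamp(System::currentTimeMillis, 0L);
        long oneHourAhead = System.currentTimeMillis() + 3_600_000L;
        System.out.println("clamped timestamp: " + clamp.clamp(oneHourAhead));
    }
}
```

Note that this changes the semantics discussed in the reply above: instead of being buffered until the watermark catches up with its timestamp, a clamped early event falls into the window that covers its arrival time.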
