flink-user mailing list archives

From Michael Tamillow <mikaeltamillo...@gmail.com>
Subject Re: Event processing time with lateness
Date Fri, 03 Jun 2016 20:02:06 GMT
Super cool stuff

On Fri, Jun 3, 2016 at 10:55 AM, Kostas Kloudas <k.kloudas@data-artisans.com
> wrote:

> You are welcome!
>
>
> On Jun 3, 2016, at 4:40 PM, Igor Berman <igor.berman@gmail.com> wrote:
>
> thanks Kosta
>
> On 3 June 2016 at 16:47, Kostas Kloudas <k.kloudas@data-artisans.com>
> wrote:
>
>> Hi Igor,
>>
>> To handle late events in Flink you would have to implement your own custom
>> trigger.
>>
>> To see a relatively more complex example of such a trigger and how to
>> implement it,
>> you can have a look at this implementation:
>> https://github.com/dataArtisans/beam_comp/blob/master/src/main/java/com/dataartisans/beam_comparison/customTriggers/EventTimeTriggerWithEarlyAndLateFiring.java
>>
>> Which implements the trigger described in this article (before the
>> conclusions section)
>> http://data-artisans.com/why-apache-beam/
>>
>> Thanks,
>> Kostas
>>
>> On Jun 3, 2016, at 2:55 PM, Igor Berman <igor.berman@gmail.com> wrote:
>>
>> Hi
>>
>> According to Tyler Akidau's presentation
>> https://docs.google.com/presentation/d/13YZy2trPugC8Zr9M8_TfSApSCZBGUDZdzi-WUz95JJw/present
>> Flink supports late arrivals for window processing, but I've seen several
>> questions on the user list regarding late arrivals, and the answer was sort of
>> "not for all use cases".
>> Can somebody clarify?
>>
>> The interesting case for me: I use event-time processing and want to
>> aggregate by tumbling window. The events come from Kafka and might be
>> late. Currently we define a lateness threshold with the watermark (e.g. 5 mins).
>>
>> After the window triggers I want to save the aggregated result to some
>> persistent storage (redis/hbase) with the start timestamp of the window.
>>
>> After this grace period, if I understand correctly, a late event won't be
>> aggregated into the existing window; instead the trigger will call the
>> aggregate function with only 1 element inside (the late one),
>>
>> so if my window method saves into persistent storage, it will overwrite the
>> aggregated result with a new one that has only 1 element inside.
>>
>> what I want to achieve - is that late arrival will trigger window method
>> with all elements (late + all other) so that aggregated result will be
>> complete
>>
>> you can think about use case of page visits counts per minute, while due
>> to some problems page visit events might arrive late
>>
>> thanks in advance
>>
>>
>>
>>
>
>
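
The behaviour discussed in this thread can be sketched as a plain-Java simulation with no Flink dependency. This is not the linked EventTimeTriggerWithEarlyAndLateFiring implementation and does not use Flink's Trigger API. The class name LateFiringWindowSketch, its methods, and the in-memory `sink` map (a stand-in for redis/hbase keyed by window start) are all hypothetical and exist only for illustration. The idea is that window state is kept for an allowed-lateness period after the watermark passes the window end, so a late event re-fires the window with the full count instead of a count of 1. In Flink terms this loosely corresponds to a trigger that fires without purging the window contents.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical simulation of a tumbling count window that re-fires on late events. */
public class LateFiringWindowSketch {
    private final long windowSizeMs;
    private final long allowedLatenessMs;
    private long watermark = Long.MIN_VALUE;
    // window start -> running count, kept until the allowed lateness expires
    private final Map<Long, Long> counts = new HashMap<>();
    // stand-in for persistent storage: window start -> last emitted count
    private final Map<Long, Long> sink = new HashMap<>();

    public LateFiringWindowSketch(long windowSizeMs, long allowedLatenessMs) {
        this.windowSizeMs = windowSizeMs;
        this.allowedLatenessMs = allowedLatenessMs;
    }

    /** Adds one event; a late event inside the grace period re-fires the full window. */
    public void onElement(long eventTimeMs) {
        long start = eventTimeMs - Math.floorMod(eventTimeMs, windowSizeMs);
        long end = start + windowSizeMs;
        if (watermark >= end + allowedLatenessMs) {
            return; // too late: window state was already purged, so the event is dropped
        }
        long total = counts.merge(start, 1L, Long::sum);
        if (watermark >= end) {
            // late firing: overwrite the stored result with the complete count
            sink.put(start, total);
        }
    }

    /** Advances the watermark, fires windows that have ended, purges expired ones. */
    public void onWatermark(long newWatermark) {
        watermark = Math.max(watermark, newWatermark);
        Iterator<Map.Entry<Long, Long>> it = counts.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> e = it.next();
            long end = e.getKey() + windowSizeMs;
            if (watermark >= end) {
                sink.put(e.getKey(), e.getValue()); // on-time (or repeated) firing
            }
            if (watermark >= end + allowedLatenessMs) {
                it.remove(); // grace period over: drop the window state
            }
        }
    }

    /** Returns the count last written for a window start, or null if never fired. */
    public Long stored(long windowStart) {
        return sink.get(windowStart);
    }

    public static void main(String[] args) {
        // 1-minute windows with a 5-minute grace period, as in the question
        LateFiringWindowSketch w = new LateFiringWindowSketch(60_000L, 300_000L);
        w.onElement(1_000L);
        w.onElement(2_000L);
        w.onWatermark(60_000L);
        System.out.println("after on-time firing: " + w.stored(0L));
        w.onElement(3_000L); // late, but inside the grace period
        System.out.println("after late firing: " + w.stored(0L));
    }
}
```

Because the sink is keyed by window start, each late firing overwrites the earlier result with a larger, complete count, which matches the page-visits-per-minute use case. The cost is that window state must be retained for the whole grace period.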
