flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Flink - Handling late events - main vs late window firing
Date Tue, 15 Aug 2017 06:26:27 GMT
I would say so, yes. But I don't consider ProessWindowFunction to be low-level, it's just the
function that should be used for processing windows if you need more information about context.

Best,
Aljoscha

> On 14. Aug 2017, at 22:53, M Singh <mans2singh@yahoo.com> wrote:
> 
> Thanks Aljoscha for your response.
> 
> Just to clarify - the only way to handle the duplication scenario properly is by using
the ProcessWindowFunction - there is no high level function for this.
> 
> Thanks again.
> 
> 
> On Wednesday, August 9, 2017 6:26 AM, Aljoscha Krettek <aljoscha@apache.org> wrote:
> 
> 
> Hi,
> 
> 1. You could use a ProcessWindowFunction instead of a WindowFunction. In there, you can
query the current watermark and thus determine why the firing is happening. Also, in a ProcessWindowFunction
you can keep per-window state, this would allow you to keep a bit of state that can tell you
whether this is the first firing for a given window or the number of firings so far.
> 
> 2. This depends on whether the Trigger is purging or not. The default EventTimeTrigger
is not purging, meaning that all elements in the window will be preserved after firing (until
the watermark reaches the end of the window plus the allowed lateness). You can turn this
into a purging trigger using PurgingTrigger.of(EventTimeTrigger.create()). You would specify
this using .trigger() on WindowedStream when constructing your windowed operation.
> 
> 3. It doesn't, you have to manually keep state in a ProcessWindowFunction to distinguish
between different cases, as mentioned above.
> 
> 4. Currently, I think there are no examples because this depends to a large degree on
the specifics of the application. I'm afraid.
> 
> Best,
> Aljoscha
> 
>> On 6. Aug 2017, at 18:45, M Singh <mans2singh@yahoo.com <mailto:mans2singh@yahoo.com>>
wrote:
>> 
>> Hi Folks:
>> 
>> I am going through flink documentation and it states the following:
>> 
>> "You should be aware that the elements emitted by a late firing should be treated
as updated results of a previous computation, i.e., your data stream will contain multiple
results for the same computation. Depending on your application, you need to take these duplicated
results into account or deduplicate them."
>> 
>> I wanted to find out the following:
>> 
>> 1. How do we distinguish the late firing from the main firing ?
>> 2. Does the late firing including all events or only late events ?
>> 3. How does the late vs main firing affect the associated window function ?
>> 4. Are there any examples of how to handle these events and deduplication mentioned
in the documentation ?
>> 
>> Thanks for your help.
>> 
>> Mans
> 
> 
> 


Mime
View raw message