flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Processing windows in event time order
Date Thu, 21 Jul 2016 11:52:52 GMT
Yes, that is to be expected. Stream 2 should only send the watermark once
the elements with a timestamp lower than the watermark have been sent as
well.

On Thu, 21 Jul 2016 at 13:10 Sameer W <sameer@axiomine.com> wrote:

> Thanks, Aljoscha,
>
> This what I am seeing when I use Ascending timestamps as watermarks-
>
> Consider a window if 1-5 seconds
> Stream 1- Sends Elements A,B
>
> Stream 2 (20 seconds later) - Sends Elements C,D
>
> I see Window (1-5) fires first with just A,B. After 20 seconds Window
> (1-5) fires again but this time with only C,D. If I add a delay where I lag
> the watermarks by 20 seconds, then only one instance of the Window (1-5)
> fires with elements A,B,C,D.
>
> Sameer
>
> On Thu, Jul 21, 2016 at 5:17 AM, Aljoscha Krettek <aljoscha@apache.org>
> wrote:
>
>> Hi David,
>> windows are being processed in order of their end timestamp. So if you
>> specify an allowed lateness of zero (which will only be possible on Flink
>> 1.1 or by using a custom trigger) you should be able to sort the elements.
>> The ordering is only valid within one key, though, since windows for
>> different keys with the same end timestamp will be processed in an
>> arbitrary order.
>>
>> @Sameer If both sources emit watermarks that are correct for the elements
>> that they are emitting the Trigger should only fire when both sources
>> progressed their watermarks sufficiently far. Could you maybe give a more
>> detailed example of the problem that you described?
>>
>> Cheers,
>> Aljoscha
>>
>>
>> On Thu, 21 Jul 2016 at 04:03 Sameer Wadkar <sameer@axiomine.com> wrote:
>>
>>> Hi,
>>>
>>> If watermarks arriving from multiple sources, how long does the Event
>>> Time Trigger wait for the slower source to send its watermarks before
>>> triggering only from the faster source? I have seen that if one of the
>>> sources is really slow then the elements of the faster source fires and
>>> when the elements arrive from the slower source, the same window fires
>>> again with the new elements only. I can work around this by adding delays
>>> but does merging watermarks require that both have arrived by the time the
>>> watermarks progress to the point where a window can be triggered? Is
>>> applying a delay in the watermark the only way to solve this.
>>>
>>> Sameer
>>>
>>> Sent from my iPhone
>>>
>>> On Jul 20, 2016, at 9:41 PM, Vishnu Viswanath <
>>> vishnu.viswanath25@gmail.com> wrote:
>>>
>>> Hi David,
>>>
>>> You are right, the events in the window are not sorted according to the
>>> EventTime hence the processing is not done in an increasing order of
>>> timestamp.
>>> As you said, you will have to do the sorting yourself in your window
>>> function to make sure that you are processing the events in order.
>>>
>>> What Flink does is (when EventTime is set and timestamp is assigned), it
>>> will assign the elements to the Windows based on the EventTime, which
>>> otherwise (if using ProcessingTime) might have ended up in a different
>>> Window. (as per the ProcessingTime).
>>>
>>> This is as per my limited knowledge, other Flink experts can correct me
>>> if this is wrong.
>>>
>>> Thanks,
>>> Vishnu
>>>
>>> On Wed, Jul 20, 2016 at 9:30 PM, David Desberg <david.desberg@uber.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> In Flink, after setting the time characteristic to event time and
>>>> properly assigning timestamps/watermarks, time-based windows will be
>>>> created based upon event time. If we need to process events within a window
>>>> in event time order, we can sort the windowed values and process as
>>>> necessary by applying a WindowFunction. However, as I understand it, there
>>>> is no guarantee that time-based windows will be processed in time order.
Is
>>>> this correct? Or, if we assume a watermarking system that (for example's
>>>> sake) does not allow any late events, is there a way within Flink to
>>>> guarantee that windows will be processed (via an applied WindowFunction)
in
>>>> strictly increasing time order?
>>>>
>>>> If necessary, I can provide a more concrete explanation of what I
>>>> mean/am looking for.
>>>>
>>>> Thanks!
>>>> David
>>>
>>>
>>>
>

Mime
View raw message