flink-user mailing list archives

From "LINZ, Arnaud" <AL...@bouyguestelecom.fr>
Subject RE: TimeWindow not getting last elements any longer with flink 1.0 vs 0.10.1
Date Tue, 15 Mar 2016 11:01:58 GMT
Hi,

All right… I find this new behavior dangerous: with processing time windows, you will always
miss the last elements of any source that does not last forever.
I’ve created a source wrapper that sleeps after emitting the last element so that unit tests
that use processing time work.
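
A minimal sketch of the wrapper idea, written against a hypothetical ToySource interface rather than Flink's real source interface so that it runs without any Flink dependency. The names ToySource and LingeringSource and the lingerMillis parameter are assumptions made for illustration; they are not the wrapper actually used in the tests.

```java
import java.util.function.Consumer;

/** Hypothetical stand-in for a streaming source; this is NOT a Flink interface. */
interface ToySource<T> {
    void run(Consumer<T> out);
}

/** Wraps a finite source and stays alive for a grace period after its last element. */
final class LingeringSource<T> implements ToySource<T> {
    private final ToySource<T> inner;
    private final long lingerMillis; // assumed grace period, should exceed the window size

    LingeringSource(ToySource<T> inner, long lingerMillis) {
        this.inner = inner;
        this.lingerMillis = lingerMillis;
    }

    @Override
    public void run(Consumer<T> out) {
        // Emit everything the wrapped source has to offer.
        inner.run(out);
        // Then wait, so that pending processing time windows get a chance to fire
        // before the source (and with it the job) finishes.
        try {
            Thread.sleep(lingerMillis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag and return early.
            Thread.currentThread().interrupt();
        }
    }
}
```

The grace period has to be longer than the largest processing time window in the job; otherwise the last window can still be lost.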

Cheers,
Arnaud


From: Till Rohrmann [mailto:trohrmann@apache.org]
Sent: Monday, 14 March 2016 15:11
To: user@flink.apache.org
Subject: Re: TimeWindow not getting last elements any longer with flink 1.0 vs 0.10.1


Hi Arnaud,

with version 1.0, the behaviour of window triggering for finite streams was slightly
changed. If you use event time, all unfinished windows are triggered when your
stream ends. The reasoning is that the end of a stream is equivalent to saying that no
more elements will arrive until the maximum time (infinity) has been reached. This knowledge allows
you to emit a Long.MaxValue watermark when an event time stream is finished, which triggers
all lingering windows.
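
The event time case can be illustrated with a toy model in plain Java. This is only a sketch of the idea, not Flink's implementation: the class EventTimeWindowModel and its methods are hypothetical names, and the rule that a window fires once the watermark reaches its last timestamp is an assumption of the model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of tumbling event time windows; hypothetical, not Flink code. */
final class EventTimeWindowModel {
    private final long sizeMillis;
    // Window start -> buffered elements, ordered by start time.
    private final TreeMap<Long, List<String>> pending = new TreeMap<>();
    private final List<List<String>> fired = new ArrayList<>();

    EventTimeWindowModel(long sizeMillis) {
        this.sizeMillis = sizeMillis;
    }

    /** Buffers an element in the window [start, start + size) that contains its timestamp. */
    void onElement(String value, long timestamp) {
        long start = timestamp - Math.floorMod(timestamp, sizeMillis);
        pending.computeIfAbsent(start, k -> new ArrayList<>()).add(value);
    }

    /** Fires every window whose last timestamp is at or below the watermark. */
    void onWatermark(long watermark) {
        while (!pending.isEmpty()) {
            long maxTimestamp = pending.firstKey() + sizeMillis - 1;
            if (maxTimestamp > watermark) {
                break;
            }
            fired.add(pending.pollFirstEntry().getValue());
        }
    }

    /** End of stream: no more elements can arrive, so advance time to "infinity". */
    void onEndOfStream() {
        onWatermark(Long.MAX_VALUE);
    }

    List<List<String>> fired() {
        return fired;
    }

    int pendingWindows() {
        return pending.size();
    }
}
```

In this model, a 3 second window that has received elements stays pending while the watermark is below the window end, and is flushed as soon as onEndOfStream() is called.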

In contrast to event time, you cannot say the same about a finished processing time stream.
There we don’t have a logical time; we use the actual processing time (the wall clock) to reason about windows.
When a stream finishes, we cannot fast forward the processing time to a point where the
windows would fire. They could only fire if we kept the operators alive until the wall clock
tells us that it’s time to fire the windows. However, no such feature is implemented
in Flink yet.
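
The processing time case can be modelled with a wall clock timer from the JDK. Again, this is a sketch under assumptions and not Flink code: ProcessingTimeWindowModel, countEmitted, and the keepAlive flag are hypothetical names, and a single scheduled task stands in for the window trigger.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of one processing time window driven by a wall clock timer. */
final class ProcessingTimeWindowModel {

    /** Returns how many elements the window emitted before the "job" ended. */
    static int countEmitted(List<String> elements, long windowMillis, boolean keepAlive) {
        ScheduledExecutorService timers = Executors.newSingleThreadScheduledExecutor();
        Queue<String> buffer = new ConcurrentLinkedQueue<>();
        AtomicInteger emitted = new AtomicInteger();

        // The window can only fire when the wall clock reaches the end of the window.
        timers.schedule(() -> { emitted.addAndGet(buffer.size()); },
                windowMillis, TimeUnit.MILLISECONDS);

        // The finite source emits everything at once and then ends.
        buffer.addAll(elements);

        if (keepAlive) {
            // Keep the operator alive until the pending timer has fired.
            timers.shutdown();
            try {
                timers.awaitTermination(windowMillis + 5_000, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        } else {
            // The job ends with the source: the pending timer is cancelled, not fast forwarded.
            timers.shutdownNow();
        }
        return emitted.get();
    }
}
```

With keepAlive set to false, the timer is cancelled before the window end is reached and nothing is emitted; with keepAlive set to true, the model waits for the wall clock and the buffered elements are counted.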

I hope this helps you to understand the failing test cases.

Cheers,
Till

On Mon, Mar 14, 2016 at 1:14 PM, LINZ, Arnaud <ALINZ@bouyguestelecom.fr> wrote:
Hello,

I’ve switched my Flink version from 0.10.1 to 1.0 and I have a regression in some of my
unit tests.

To narrow down the problem, here is what I’ve figured out:


- I use a simple streaming application with a source defined as fromElements("Element 1", "Element 2", "Element 3")

- I use a simple time window function with a 3 second window: timeWindowAll(Time.seconds(3))

- I use an apply() function and count the total number of elements I get with a global counter

With the previous version, I got all three elements, not because the window was triggered
within 3 seconds, but because the source ended.
With version 1.0, I don’t get any elements. That’s annoying because when the source
ends, the application ends, so sleeping 5 seconds after the execute() method does not help.

(If I replace fromElements with fromCollection on a 10000 element list and Time.seconds(3)
with Time.milliseconds(1), I get a random number of elements.)

Is this behavior intended? If so, how do I get my last elements now?

Best regards,
Arnaud



