flink-user mailing list archives

From "Tzu-Li (Gordon) Tai" <tzuli...@apache.org>
Subject Re: Flink 1.2 time window operation
Date Thu, 30 Mar 2017 06:35:49 GMT
Hi Dominik,

Was the job running with processing time or event time? If event time, how are you generating
the watermarks?
These two factors are normally the first place to look when trying to understand how windows
fire in Flink.
I can explain further once you provide this information. Also, are you using Kafka
0.10?
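
To illustrate why the time characteristic matters, here is a rough sketch in plain Java (no Flink dependency) of how an epoch-aligned tumbling window is assigned. It assumes processing time and a window offset of zero; the class name, method names, and the timestamp are hypothetical and chosen only for illustration. Under those assumptions a one-minute window ends on the wall-clock minute boundary, so the first window can fire well before a full minute has passed since the job started.

```java
import java.time.Duration;

public class TumblingWindowAlignment {

    // Start of the tumbling window that contains timestampMs,
    // assuming windows are aligned to the epoch (offset 0).
    public static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - Math.floorMod(timestampMs, windowSizeMs);
    }

    // Exclusive end of that window; with processing time the window
    // can fire once the clock passes this point.
    public static long windowEnd(long timestampMs, long windowSizeMs) {
        return windowStart(timestampMs, windowSizeMs) + windowSizeMs;
    }

    public static void main(String[] args) {
        long size = Duration.ofMinutes(1).toMillis();

        // Hypothetical: the first record is processed 45 s into a clock minute.
        long firstRecord = 10 * size + 45_000L;

        long start = windowStart(firstRecord, size);
        long end = windowEnd(firstRecord, size);

        System.out.println("window = [" + start + ", " + end + ")");
        System.out.println("first firing after " + (end - firstRecord) + " ms");
    }
}
```

With event time the same assignment applies to the record timestamps, but firing is driven by watermarks rather than the wall clock, which is why the watermark generation matters.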

Cheers,
Gordon

On March 27, 2017 at 11:25:49 PM, Dominik Safaric (dominiksafaric@gmail.com) wrote:

Hi all,  

Lately I’ve been investigating the performance characteristics of Flink as part of our
internal benchmark. As part of this, we’ve developed and deployed an application that polls
data from Kafka and groups the data by key over a fixed time window of one minute.  

In total, the topic that the KafkaConsumer polled from consists of 100 million messages, each
100 bytes in size. We expected that no records would be read from or produced
back to Kafka during the first minute of the window operation; however, this is unfortunately
not the case. Below you may find a plot showing the number of records produced per second.
 

Could anyone provide an explanation for the behaviour shown in the graph below? What are
the reasons for consuming/producing messages from/to Kafka while the window has not expired
yet?  

