flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Steve <sdou...@gmail.com>
Subject Sensor's Data Aggregation Out of order with EventTime.
Date Thu, 06 Oct 2016 08:53:45 GMT
Hi all,
We have some sensors that sends data into kafka. Each kafka partition have a
set of deferent sensor writing data in it. We consume the data from flink.
We want to SUM up the values in half an hour intervals in eventTime(extract
from data).
The result is a keyed stream by sensor_id with timeWindow of 30 minutes.

Usually when you have to deal with Sensor data you have a priori accept that
your data will be ordered by timestamp for each sensor id. A characteristic
example of data arriving is described below. 

The problem is that a watermark generated is closing the windows before all
nodes have finished(7record).


Is this scenario possible with EventTime? Or we need to choose another
technique? We thought that it could be done by using one kafka partition for
each sensor, but this would result in thousand partitions in kafka, which
may be inefficient. Could you propose a possible solution for these kind of
data arriving?

View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Sensor-s-Data-Aggregation-Out-of-order-with-EventTime-tp9361.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.

View raw message