flink-user mailing list archives

From Stephan Epping <stephan.epp...@zweitag.de>
Subject Re: Maintaining watermarks per key, instead of per operator instance
Date Thu, 10 Nov 2016 10:01:07 GMT
Hello,

I found this question in the Nabble archive (http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Maintaining-watermarks-per-key-instead-of-per-operator-instance-tp7288.html),
but I don't know how to reply there.

Here is my question regarding the mentioned thread:

> Hello,
>
> I have similar requirements (see Stack Overflow: http://stackoverflow.com/questions/40465335/apache-flink-multiple-window-aggregations-and-late-data).
> I am pretty new to Flink; could you elaborate on a possible solution? We can guarantee good
> ordering by sensor_id, so watermarking by key would be the only reasonable approach for us
> (sensorData.keyBy('id').timeWindow(1.minute).sum('value')). Could I do my own watermarking per
> key, e.g. after sensorData.keyBy('id').overwriteWatermarking()...? Or maybe use custom state plus
> a custom trigger? What happens if a sensor dies or is removed completely? How can that be
> detected, given that its watermarks would then be missing for window garbage collection? Or could
> we dynamically schedule a job for each sensor? That would result in 1000 jobs.
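To make the idea concrete, here is a minimal sketch (not Flink API; all names such as PerKeyWindows, on_event, and on_timer are hypothetical) of what "custom state plus a custom trigger" could look like conceptually: a per-key watermark that fires tumbling windows for that key alone, plus an idle-timeout timer (assumed at 5 minutes here) so that windows of a dead or removed sensor are still flushed and garbage-collected.

```python
# Conceptual simulation of per-key watermarks and idle-key eviction.
# This is NOT Flink code; it only illustrates the state/trigger logic.

from collections import defaultdict

WINDOW_MS = 60_000         # 1-minute tumbling windows, as in the keyBy example
IDLE_TIMEOUT_MS = 300_000  # assumption: evict a key after 5 minutes of silence


class PerKeyWindows:
    def __init__(self):
        # Highest event timestamp seen per key: the per-key "watermark".
        self.watermark = defaultdict(int)
        # key -> window start timestamp -> running sum of values.
        self.windows = defaultdict(lambda: defaultdict(float))

    def on_event(self, key, ts, value):
        """Add an event and fire any windows closed by this key's watermark."""
        start = ts - ts % WINDOW_MS
        self.windows[key][start] += value
        self.watermark[key] = max(self.watermark[key], ts)
        return self._fire(key)

    def _fire(self, key):
        # A window [start, start + WINDOW_MS) fires once this key's
        # watermark has passed its end; other keys are unaffected.
        fired = []
        for start in sorted(self.windows[key]):
            if start + WINDOW_MS <= self.watermark[key]:
                fired.append((key, start, self.windows[key].pop(start)))
        return fired

    def on_timer(self, key, now_ms):
        """Simulated processing-time timer: flush and drop an idle key."""
        if now_ms - self.watermark[key] >= IDLE_TIMEOUT_MS:
            leftover = [(key, s, v) for s, v in sorted(self.windows[key].items())]
            del self.windows[key]
            del self.watermark[key]
            return leftover
        return []
```

For example, an event at t=65s closes the first key's window [0s, 60s) without waiting for any other sensor, and the timer handles the dead-sensor case:

```python
pkw = PerKeyWindows()
pkw.on_event("sensor-1", 10_000, 1.0)          # opens window [0, 60000)
pkw.on_event("sensor-1", 65_000, 2.0)          # fires ("sensor-1", 0, 1.0)
pkw.on_timer("sensor-1", 400_000)              # idle: flushes [60000, 120000), evicts key
```

In real Flink, the timer part would correspond to a processing-time timer registered per key; the point of the sketch is just that per-key progress tracking and idle-key cleanup are both plain keyed state.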


Thanks,
Stephan


