flink-user mailing list archives

From Bowen Li <bowen...@offerupnow.com>
Subject Even out the number of generated windows
Date Fri, 25 Aug 2017 04:23:07 GMT
Hi guys,

I have a question about how Flink generates windows.

We are using a 1-day sliding window with a 1-hour slide to count some
features of items based on event time. We have about 20 million items. We
observed that Flink only emits results at a fixed time each hour (e.g. 1am,
2am, 3am, or 1:15am, 2:15am, 3:15am with a 15-minute offset). That means
20 million windows/records are generated at the same time every hour, which
overwhelms our sink, while nothing is generated during the rest of the hour.
The pattern is like this:

# generated windows
|
|    /\              /\
|   /  \            /  \
|__/____\__________/____\__
                       time
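
A minimal sketch of the arithmetic behind this pattern, in plain Java with no
Flink dependency. It assumes that window starts are aligned to multiples of
the slide plus the offset, which is how Flink's SlidingEventTimeWindows
assigner is understood to work; the class and method names are made up for
illustration and are not Flink API.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

public class SlidingWindowEnds {
    // Start of the latest slide-aligned window that contains the timestamp.
    static long lastWindowStart(long timestamp, long offset, long slide) {
        return timestamp - (timestamp - offset + slide) % slide;
    }

    // End timestamps of every sliding window that contains the timestamp,
    // latest window first.
    static List<Long> windowEnds(long timestamp, long size, long slide, long offset) {
        List<Long> ends = new ArrayList<>();
        for (long start = lastWindowStart(timestamp, offset, slide);
             start > timestamp - size;
             start -= slide) {
            ends.add(start + size);
        }
        return ends;
    }

    public static void main(String[] args) {
        long size = Duration.ofDays(1).toMillis();
        long slide = Duration.ofHours(1).toMillis();
        long offset = Duration.ofMinutes(15).toMillis();
        // Example event time only; any timestamp gives the same alignment.
        long ts = Instant.parse("2017-08-25T04:23:07Z").toEpochMilli();
        for (long end : windowEnds(ts, size, slide, offset)) {
            System.out.println(Instant.ofEpochMilli(end));
        }
    }
}
```

Under that assumption, every event belongs to 24 windows and every window end
falls on the same minute of the hour, regardless of the item key. That is why
all 20 million items emit together once the watermark passes that minute.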

Is there any way to even out the number of windows/records generated within
an hour? Can we get an evenly distributed load like this?

# generated windows
|
|
| ------------------------
|__________________________
                       time
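
To make the ask concrete, here is a rough sketch in plain Java of what
"evenly distributed" could mean: each item key gets its own deterministic
offset inside the slide, so window boundaries for different items fall on
different minutes. The helper names, the 60-bucket split, and the "item-N"
keys are all hypothetical. Whether and how this can be wired into Flink (for
example through a custom window assigner) is exactly the open question; this
sketch does not establish it.

```java
import java.time.Duration;

public class StaggeredOffsets {
    // Hypothetical helper: maps a key to one of `buckets` evenly spaced
    // offsets inside a slide. The same key always gets the same offset.
    static long offsetForKey(String key, long slideMillis, int buckets) {
        int bucket = Math.floorMod(key.hashCode(), buckets);
        return bucket * (slideMillis / buckets);
    }

    // First slide boundary strictly after the timestamp, for windows
    // aligned to the given offset.
    static long nextBoundary(long timestamp, long slideMillis, long offsetMillis) {
        long lastBoundary =
            timestamp - Math.floorMod(timestamp - offsetMillis, slideMillis);
        return lastBoundary + slideMillis;
    }

    public static void main(String[] args) {
        long slide = Duration.ofHours(1).toMillis();
        for (int i = 0; i < 5; i++) {
            String key = "item-" + i;  // made-up item keys
            long offset = offsetForKey(key, slide, 60);
            System.out.println(key + " would emit at minute "
                + offset / 60_000 + " of each hour");
        }
    }
}
```

With 60 one-minute buckets, the hourly burst of 20 million records would
become roughly 60 smaller bursts, assuming the key hashes spread evenly
across buckets.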

Thanks,
Bowen
