flink-user mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Generating processing time watermarks in idle event time kafka streams?
Date Fri, 14 Dec 2018 09:51:01 GMT
Hi William,

there is currently no official way of doing this but the Flink community will be working on
this as part of an upcoming source refactoring.

For now, there is this watermark extractor that I once wrote: https://github.com/aljoscha/flink/commit/6e4419e550caa0e5b162bc0d2ccc43f6b0b3860f
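
The general idea behind such an extractor can be sketched as follows. This is a minimal, Flink-independent illustration and not the code from the linked commit: the class name, the idle timeout, and the injected clock are all assumptions made for the example. It tracks the highest event timestamp seen so far and, once no element has arrived for longer than the idle timeout, lets the watermark advance at the pace of processing time.

```java
import java.util.function.LongSupplier;

// Hypothetical helper, not a Flink class: bounded-out-of-orderness watermarks
// that keep advancing with processing time while the source is idle.
final class IdleAwareWatermarks {
    private final long maxOutOfOrdernessMs; // allowed event-time lateness
    private final long idleTimeoutMs;       // silence tolerated before the source counts as idle
    private final LongSupplier clock;       // processing-time clock, injected so it can be tested

    private long maxEventTs = Long.MIN_VALUE;          // highest event timestamp seen
    private long lastElementProcTime = Long.MIN_VALUE; // processing time of the last element
    private long lastWatermark = Long.MIN_VALUE;       // last value returned, never decreases

    IdleAwareWatermarks(long maxOutOfOrdernessMs, long idleTimeoutMs, LongSupplier clock) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        this.idleTimeoutMs = idleTimeoutMs;
        this.clock = clock;
    }

    // Call once per element with its event timestamp.
    void onElement(long eventTimestampMs) {
        maxEventTs = Math.max(maxEventTs, eventTimestampMs);
        lastElementProcTime = clock.getAsLong();
    }

    // Call periodically; returns the current watermark in event-time milliseconds.
    long currentWatermark() {
        if (maxEventTs == Long.MIN_VALUE) {
            return lastWatermark; // no element seen yet, nothing to base a watermark on
        }
        long candidate = maxEventTs - maxOutOfOrdernessMs;
        long idleFor = clock.getAsLong() - lastElementProcTime;
        if (idleFor > idleTimeoutMs) {
            // Source is idle: move forward by the processing time spent beyond the timeout.
            candidate += idleFor - idleTimeoutMs;
        }
        lastWatermark = Math.max(lastWatermark, candidate);
        return lastWatermark;
    }
}
```

In a Flink 1.7 job this logic would presumably sit behind a periodic watermark assigner attached to the Kafka consumer, with `System::currentTimeMillis` as the clock. Note the trade-off: elements that arrive after an idle period with timestamps below the advanced watermark are treated as late.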


> On 14. Dec 2018, at 09:52, William Saar <william@saar.se> wrote:
> Are there any standardized components for generating watermarks based on processing time
> in an event time stream when there is no data from a source?
> The docs for event time [1] indicate that people are doing this, but the only suggestion
> on Stack Overflow [2] is to give every window operator in the stream a special processing
> time trigger, which seems clumsy and error-prone if you have many jobs and many windows.
> I'm thinking there might be a watermark generator that can be plugged into the Kafka
> consumer to make it generate processing time watermarks when there are no messages. Or maybe
> a standardized operator that can be inserted at the beginning of each stream, just after
> the source, to generate such idle-source processing time watermarks in event time streams?
> 1. "Note that sometimes when event time programs are processing live data in real-time,
> they will use some processing time operations in order to guarantee that they are progressing
> in a timely fashion."
> 2. https://stackoverflow.com/questions/46432368/flink-window-does-not-process-data-at-end-of-stream

