flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Watermarks per key
Date Tue, 21 Feb 2017 10:24:34 GMT
Hi,
managing a per-key watermark would require keeping to current watermark for
each key, for example at the sources or in a timestamp/watermark assigner.
The problem then is figuring out when you can discard that state because it
would otherwise grow indefinitely if you have an evolving key space.

You can simulate per-key watermarks by having a wrapping type in your
pipeline that carries the watermarks. Something like ValueOrWatermark<K, V>
and then in your operators you manually manage the watermark, per-key in
your own state. You would run into the same problem of the evolving
key-space, however.

Cheers,
Aljoscha

On Mon, 20 Feb 2017 at 22:27 jganoff <jordan@corvana.com> wrote:

> There's nothing stopping me assigning timestamps and generating watermarks
> on
> a keyed stream in the implementation and the KeyedStream API supports it.
> It
> appears the underlying operator that gets created in
> DataStream.assignTimestampsAndWatermarks() isn't key-aware and globally
> tracks timestamps. So is that what's technically preventing assigning
> timestamps per key from working?
>
> I'm curious to hear Aljoscha's thoughts on watermark management across
> keys.
>
> Thanks!
>
>
>
> --
> View this message in context:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Watermarks-per-key-tp11628p11761.html
> Sent from the Apache Flink User Mailing List archive. mailing list archive
> at Nabble.com.
>

Mime
View raw message