flink-user mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Kafka partition alignment for event time
Date Tue, 09 Feb 2016 09:21:45 GMT
Hi,
in general it should not be a problem if one parallel instance of a source is responsible for
several Kafka partitions. It can become a problem if the timestamps in the different partitions
differ by a lot and the watermark assignment logic is not able to handle this.
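
To make the concern concrete, here is a minimal sketch in plain Java with no Flink dependency. The class `PerPartitionWatermarks` and its methods are hypothetical names made up for illustration, not Flink API. It assumes one parallel instance reads several partitions and compares two ways of deriving a watermark candidate: the highest timestamp seen across all partitions versus the lowest per-partition maximum.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of Flink): tracks the highest timestamp seen per Kafka partition. */
public class PerPartitionWatermarks {
    private final Map<Integer, Long> maxTimestampPerPartition = new HashMap<>();

    /** Registers the partitions assigned to this parallel instance. */
    public PerPartitionWatermarks(int... partitions) {
        for (int p : partitions) {
            maxTimestampPerPartition.put(p, Long.MIN_VALUE);
        }
    }

    /** Records the event timestamp of an element read from the given partition. */
    public void onRecord(int partition, long timestamp) {
        maxTimestampPerPartition.merge(partition, timestamp, Math::max);
    }

    /** Conservative candidate: no partition is considered "late" relative to this value. */
    public long minAcrossPartitions() {
        long min = Long.MAX_VALUE; // returned unchanged if no partitions are registered
        for (long ts : maxTimestampPerPartition.values()) {
            min = Math.min(min, ts);
        }
        return min;
    }

    /** Aggressive candidate: elements from slower partitions may fall behind this value. */
    public long maxAcrossPartitions() {
        long max = Long.MIN_VALUE;
        for (long ts : maxTimestampPerPartition.values()) {
            max = Math.max(max, ts);
        }
        return max;
    }

    public static void main(String[] args) {
        PerPartitionWatermarks w = new PerPartitionWatermarks(0, 1);
        w.onRecord(0, 1000L);
        w.onRecord(0, 1100L);
        w.onRecord(1, 200L); // partition 1 lags far behind partition 0
        System.out.println("min = " + w.minAcrossPartitions()); // min = 200
        System.out.println("max = " + w.maxAcrossPartitions()); // max = 1100
    }
}
```

If the watermark followed the maximum (1100 here), the element with timestamp 200 from the lagging partition would arrive behind the watermark. Following the per-partition minimum avoids that, at the cost of waiting for the slowest partition.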

How are you assigning the timestamps/watermarks in your job?

Cheers,
Aljoscha
> On 08 Feb 2016, at 21:51, shikhar <shikhar@schmizz.net> wrote:
> 
> Stephan explained in that thread that we're picking the min watermark when
> doing operations that join streams from multiple sources. If we have m:n
> partition-source assignment where m>n, the source is going to end up with
> the max watermark. Having m<=n ensures that the lowest watermark is used.
> 
> Re: automatic enforcement, perhaps allowing for more than 1 Kafka partition
> on a source should require opt-in, e.g. allowOversubscription()

