flink-user mailing list archives

From "Tzu-Li (Gordon) Tai" <tzuli...@apache.org>
Subject Re: Kafka and parallelism
Date Mon, 05 Feb 2018 08:52:59 GMT
Hi Christophe,

You can set the parallelism of the FlinkKafkaConsumer independently of the total number of
Kafka partitions (across all subscribed streams, including newly created streams that match
a subscribed pattern).

The consumer deterministically assigns each partition to a single consumer subtask in a round-robin
fashion. E.g., if the parallelism of your FlinkKafkaConsumer is 2 and there are 6 partitions, each
consumer subtask will be assigned 3 partitions.
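The round-robin assignment above can be sketched in plain Java. This is a simplified illustration only: the actual Flink assigner additionally offsets the starting index by a hash of the topic name, so the concrete subtask indices may differ, but the even spread is the same.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartitionAssignment {

    // Simplified round-robin sketch: partition p goes to subtask p % parallelism.
    // (Flink's real assigner also adds a topic-hash-based start offset; this
    // illustration omits it.)
    static Map<Integer, List<Integer>> assign(int numPartitions, int parallelism) {
        Map<Integer, List<Integer>> bySubtask = new TreeMap<>();
        for (int p = 0; p < numPartitions; p++) {
            bySubtask.computeIfAbsent(p % parallelism, k -> new ArrayList<>()).add(p);
        }
        return bySubtask;
    }

    public static void main(String[] args) {
        // 6 partitions across 2 consumer subtasks -> 3 partitions per subtask
        System.out.println(assign(6, 2));
    }
}
```

Note that the assignment is a pure function of partition and parallelism, which is what makes it deterministic across restarts.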

As for topic pattern subscription, the FlinkKafkaConsumer supports this starting from Flink 1.4.0.
You can take a look at [1] for how to do that.

Hope this helps!


[1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/connectors/kafka.html#kafka-consumers-topic-and-partition-discovery
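The matching itself is ordinary java.util.regex. A minimal sketch of which topic names a subscription pattern would pick up; the pattern "metrics-.*" and the topic names here are hypothetical, and the Flink constructor call shown in the comment is what you would use in a real job (it needs the Flink Kafka connector on the classpath, so it is kept as a comment here):

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class TopicPatternDemo {

    // In Flink 1.4+, the same Pattern would be passed to the consumer, e.g.:
    //   new FlinkKafkaConsumer011<>(Pattern.compile("metrics-.*"),
    //       new SimpleStringSchema(), properties);
    // With flink.partition-discovery.interval-millis set in the properties,
    // newly created topics matching the pattern are picked up at runtime too.

    // Returns the topic names that the subscription pattern matches.
    static List<String> matchingTopics(Pattern pattern, List<String> topics) {
        return topics.stream()
                .filter(t -> pattern.matcher(t).matches())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Pattern subscription = Pattern.compile("metrics-.*");
        List<String> existing = Arrays.asList("metrics-cpu", "metrics-mem", "logs-app");
        System.out.println(matchingTopics(subscription, existing));
    }
}
```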

On 3 February 2018 at 6:53:47 PM, Christophe Jolif (cjolif@gmail.com) wrote:


If I'm sourcing from a KafkaConsumer, do I have to explicitly set the Flink job parallelism
to the number of partitions, or will it adjust automatically? In other words, if I
don't call setParallelism, will I get 1 or the number of partitions?

The reason I'm asking is that I'm listening to a topic pattern, not a single topic, and the
number of actual topics (and so partitions) behind the pattern can change, so it is not possible
to know ahead of time how many partitions I will get.
