flink-user mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Flink Kafka more consumers than partitions
Date Wed, 03 Aug 2016 09:35:32 GMT
Hi!

That is interesting, indeed. The idle sources should not create
backpressure. In fact, sources cannot create backpressure at all, because
backpressure pressures backwards and there is nothing backwards from the
sources ;-)
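
A minimal sketch of why half of the sources sit idle in the first setup.
It assumes a simple round-robin assignment (partition p goes to source
subtask p % parallelism); the class and method names are hypothetical, and
the real Kafka connector may distribute partitions differently.

```java
import java.util.Arrays;

public class PartitionAssignmentSketch {

    // Hypothetical helper: counts how many partitions each source subtask owns
    // under an ASSUMED round-robin scheme (partition p -> subtask p % parallelism).
    static int[] partitionsPerSubtask(int numPartitions, int parallelism) {
        int[] counts = new int[parallelism];
        for (int p = 0; p < numPartitions; p++) {
            counts[p % parallelism]++;
        }
        return counts;
    }

    // A subtask that owns no partition has nothing to read and stays idle.
    static long idleSubtasks(int numPartitions, int parallelism) {
        return Arrays.stream(partitionsPerSubtask(numPartitions, parallelism))
                .filter(c -> c == 0)
                .count();
    }

    public static void main(String[] args) {
        // The two setups from the question: 20 partitions, parallelism 40 vs. 20.
        System.out.println("parallelism 40: idle sources = " + idleSubtasks(20, 40));
        System.out.println("parallelism 20: idle sources = " + idleSubtasks(20, 20));
    }
}
```

Under this assumption, an idle subtask simply emits no records. It holds no
data that could slow down anything downstream.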

Do you also adjust the parallelism of the operator that interacts with
HBase, or just the source parallelism?

Cheers!

On Wed, Aug 3, 2016 at 10:14 AM, neo21 zerro <neo21_zerro@yahoo.com> wrote:

> Hello everybody,
>
> I'm using the Flink Kafka consumer 0.8.x with Kafka 0.8.2 and Flink 1.0.3 on
> YARN.
> In Kafka I have a topic which has 20 partitions, and my Flink topology
> reads from Kafka (source) and writes to HBase (sink).
>
> When:
>      1. the Flink source has parallelism set to 40 (20 of the tasks are
>         idle), I see 10.000 requests/sec on HBase
>      2. the Flink source has parallelism set to 20 (the exact number of
>         partitions), I see 100.000 requests/sec on HBase (so a 10x
>         improvement)
>
>
> It's clear that HBase is not the limiting factor in my topology.
> Assumption: Flink's backpressure mechanism kicks in more aggressively in
> case 1 and limits the ingestion of tuples into the topology.
>
> The question: In the first case, why are those 20 sources which are
> sitting idle contributing so much to the backpressure?
>
>
> Thanks guys!
>
