flink-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: 20 times higher throughput with Window function vs fold function, intended?
Date Fri, 31 Mar 2017 09:19:32 GMT
The "1,2 million" seems to be European notation.

You meant 1.2 million, right?

> On Mar 31, 2017, at 1:19 AM, Kamil Dziublinski <kamil.dziublinski@gmail.com> wrote:
> 
> Hi,
> 
> Thanks for the tip man. I tried playing with this.
> I changed fetch.message.max.bytes (I am still on Kafka 0.8) and also socket.receive.buffer.bytes.
> With tuned settings I was able to get to 1,2 million reads per second, so a 50% increase.
> 
> But unfortunately that does not increase when I enable the HBase sink again. So it means
> that backpressure kicks in and HBase writing is the limiting factor here. I will try to tweak
> this a bit more; if I find something I will share.
> 
> Cheers,
> Kamil.
> 
> On Thu, Mar 30, 2017 at 12:45 PM, Tzu-Li (Gordon) Tai <tzulitai@apache.org> wrote:
>>> I'm wondering what I can tweak further to increase this. I was reading in this
>>> blog: https://data-artisans.com/extending-the-yahoo-streaming-benchmark/
>>> about 3 million per sec with only 20 partitions. So I'm sure I should be able
>>> to squeeze more out of it.
>> 
>> 
>> Not really sure if it is relevant in the context of your case, but you could perhaps
>> try tweaking the maximum size of Kafka records fetched on each poll on the partitions.
>> You can do this by setting a higher value for “max.partition.fetch.bytes” in
>> the provided config properties when instantiating the consumer; that will directly configure
>> the internal Kafka clients.
>> Generally, all Kafka settings are applicable through the provided config properties,
>> so you can perhaps take a look at the Kafka docs to see what else there is to tune for the
>> clients.
>> 
>>> On March 30, 2017 at 6:11:27 PM, Kamil Dziublinski (kamil.dziublinski@gmail.com) wrote:
>>> 
>>> I'm wondering what I can tweak further to increase this. I was reading in this
>>> blog: https://data-artisans.com/extending-the-yahoo-streaming-benchmark/
>>> about 3 million per sec with only 20 partitions. So I'm sure I should be able
>>> to squeeze more out of it.
> 
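
For anyone following the quoted thread, here is a minimal sketch of the consumer properties being discussed. It only builds the java.util.Properties object. The class name, method names, connection values and byte sizes are hypothetical placeholders, not settings anyone in the thread reported. The property keys are the ones named above: fetch.message.max.bytes and socket.receive.buffer.bytes for the Kafka 0.8 consumer, and max.partition.fetch.bytes for the newer consumer.

```java
import java.util.Properties;

public class KafkaConsumerTuning {

    // Properties for the old (Kafka 0.8) consumer, using the two keys Kamil mentions.
    // All argument values are illustrative assumptions.
    static Properties forKafka08(String zookeeperConnect, String groupId,
                                 int fetchMessageMaxBytes, int socketReceiveBufferBytes) {
        Properties props = new Properties();
        props.setProperty("zookeeper.connect", zookeeperConnect);
        props.setProperty("group.id", groupId);
        // Maximum bytes fetched per partition in one fetch request (0.8 consumer key).
        props.setProperty("fetch.message.max.bytes", Integer.toString(fetchMessageMaxBytes));
        // Socket receive buffer used for network requests (0.8 consumer key).
        props.setProperty("socket.receive.buffer.bytes", Integer.toString(socketReceiveBufferBytes));
        return props;
    }

    // Properties for the newer (0.9+) consumer, using the key Gordon suggests.
    static Properties forNewConsumer(String bootstrapServers, String groupId,
                                     int maxPartitionFetchBytes) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Maximum bytes returned per partition on each fetch (new consumer key).
        props.setProperty("max.partition.fetch.bytes", Integer.toString(maxPartitionFetchBytes));
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values; tune against your own cluster and message sizes.
        Properties props = forKafka08("localhost:2181", "my-group", 4 * 1024 * 1024, 1024 * 1024);
        props.list(System.out);
        // The resulting object is what gets passed as the config properties when
        // instantiating the Flink Kafka consumer. Your setup may need further
        // connection properties beyond the ones shown here.
    }
}
```

Larger fetch sizes trade memory for fewer round trips, so they only help while the source is the bottleneck. As noted in the thread, once a slower sink applies backpressure, raising these values will not increase end-to-end throughput.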
