flink-user mailing list archives

From Arvid Heise <ar...@ververica.com>
Subject Re: Flink Streaming Job Tuning help
Date Mon, 18 May 2020 09:49:59 GMT
Hi Senthil,

Since your records are so large, I recommend taking the time to evaluate
a few different serializers [1].

[1]
https://flink.apache.org/news/2020/04/15/flink-serialization-tuning-vol-1.html
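
To put the record size in perspective, here is a rough back-of-the-envelope
calculation of how much data has to be serialized and shipped once the keyBy
forces a network shuffle. It is only a sketch: it assumes that "10k" in the
original post means 10 KiB (10,240 bytes) per record, and the class and method
names are made up for illustration.

```java
public class ShuffleVolume {
    // Bytes per second that must be serialized for a given input rate and record size.
    static double bytesPerSecond(long recordsPerMinute, long bytesPerRecord) {
        return recordsPerMinute * (double) bytesPerRecord / 60.0;
    }

    public static void main(String[] args) {
        long recordSize = 10L * 1024; // assumption: "10k" means 10 KiB per record
        // The two ends of the "8-9 million records a minute" range from the thread.
        for (long perMinute : new long[] {8_000_000L, 9_000_000L}) {
            double mibPerSec = bytesPerSecond(perMinute, recordSize) / (1024.0 * 1024.0);
            System.out.printf("%d records/min -> %.0f MiB/s%n", perMinute, mibPerSec);
        }
    }
}
```

Under that assumption the job moves well over 1 GiB of payload per second in
total across the cluster, so even a small per-byte saving in the serializer
adds up.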

On Wed, May 13, 2020 at 5:40 PM Senthil Kumar <senthilku@vmware.com> wrote:

> Zhijiang,
>
>
>
> Thanks for your suggestions. We will keep it in mind!
>
>
>
> Kumar
>
>
>
> *From: *Zhijiang <wangzhijiang999@aliyun.com>
> *Reply-To: *Zhijiang <wangzhijiang999@aliyun.com>
> *Date: *Tuesday, May 12, 2020 at 10:10 PM
> *To: *Senthil Kumar <senthilku@vmware.com>, "user@flink.apache.org" <
> user@flink.apache.org>
> *Subject: *Re: Flink Streaming Job Tuning help
>
>
>
> Hi Kumar,
>
>
>
> I can give some general ideas for further analysis.
>
>
>
> > We are finding that flink lags seriously behind when we introduce the
> keyBy (presumably because of shuffle across the network)
>
> The `keyBy` breaks operator chaining, so it can have a noticeable
> performance impact in practice. I guess your previous job without
> keyBy could make use of the chaining mechanism: the follow-up operator
> consumes the records emitted by the preceding operator directly, with
> no need to go through buffer serialization -> network shuffle ->
> buffer deserialization. That matters especially because your record
> size of 10K is rather large.
>
>
>
> If the keyBy is necessary in your case, then you can check where the
> current bottleneck is, e.g. whether there is back pressure, which you
> can monitor from the web UI. If so, find out which task is the
> bottleneck causing the back pressure; you can trace it with the
> network-related metrics.
>
>
>
> Also check whether there is data skew in your case, meaning that some
> tasks process more records than others. If so, increasing the
> parallelism may help to balance the load.
>
>
>
> Best,
>
> Zhijiang
>
> ------------------------------------------------------------------
>
> From:Senthil Kumar <senthilku@vmware.com>
>
> Send Time: 2020-05-13 (Wednesday) 00:49
>
> To:user@flink.apache.org <user@flink.apache.org>
>
> Subject:Re: Flink Streaming Job Tuning help
>
>
>
> I forgot to mention, we are consuming said records from AWS kinesis and
> writing out to S3.
>
>
>
> *From: *Senthil Kumar <senthilku@vmware.com>
> *Date: *Tuesday, May 12, 2020 at 10:47 AM
> *To: *"user@flink.apache.org" <user@flink.apache.org>
> *Subject: *Flink Streaming Job Tuning help
>
>
>
> Hello Flink Community!
>
>
>
> We have a fairly intensive flink streaming application, processing 8-9
> million records a minute, with each record being 10k.
>
> One of our steps is a keyBy operation. We are finding that flink lags
> seriously behind when we introduce the keyBy (presumably because of shuffle
> across the network).
>
>
>
> We are trying to tune it ourselves (size of nodes, memory, network buffers
> etc.), but before we spend way too much time on this: would it be better
> to hire some “flink tuning expert” to get us through?
>
>
>
> If so, what resources are recommended on this list?
>
>
>
> Cheers
>
> Kumar
>
>
>
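
On the data skew question raised in the quoted thread, one quick check is to
count, on a sample of keys, how many records would land on each parallel
subtask. The sketch below is plain Java and uses hashCode modulo parallelism
as a stand-in for the partitioning; Flink's real assignment goes through key
groups and a different hash, so treat the result as indicative only. The class,
the methods and the sample keys are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class SkewCheck {
    // Counts how many sampled records would land on each of `parallelism` partitions.
    // Simplification: hashCode modulo parallelism, not Flink's key-group assignment.
    static long[] recordsPerPartition(List<String> sampledKeys, int parallelism) {
        long[] counts = new long[parallelism];
        for (String key : sampledKeys) {
            counts[Math.floorMod(key.hashCode(), parallelism)]++;
        }
        return counts;
    }

    // Ratio of the busiest partition to the average load; 1.0 means perfectly even.
    static double skewFactor(long[] counts) {
        long max = Arrays.stream(counts).max().orElse(0L);
        double avg = Arrays.stream(counts).average().orElse(0.0);
        return avg == 0.0 ? 0.0 : max / avg;
    }

    public static void main(String[] args) {
        // Made-up sample in which one key dominates.
        List<String> sample = Arrays.asList(
                "tenant-a", "tenant-a", "tenant-a", "tenant-b", "tenant-c");
        long[] counts = recordsPerPartition(sample, 4);
        System.out.println(Arrays.toString(counts) + " skew=" + skewFactor(counts));
    }
}
```

A skew factor well above 1.0 on a representative sample suggests that a few
keys carry most of the load. If a single key dominates, raising the parallelism
alone will not spread it, because all records with the same key still go to the
same subtask.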


-- 

Arvid Heise | Senior Java Developer