flink-user mailing list archives

From Weihua Hu <huweihua....@gmail.com>
Subject Re: Single task backpressure problem with Credit-based Flow Control
Date Mon, 25 May 2020 02:39:42 GMT
Hi, Zhijiang

I understand the normal credit-based backpressure mechanism. Usually the Sink's inPoolUsage
would be full, and the task stack would also show some related information.
This time is different: the Sink's inPoolUsage is 0.
I also checked the stacks. The Map is waiting in org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestMemorySegment,
while the Sink is idle, waiting for data to process. This is not what I would expect.
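
To make the contrast concrete, here is a toy model of credit-based flow control. It is only a sketch and not Flink's code: ToyReceiver, ToySender, and slow_sink_scenario are made-up names, each record is assumed to occupy exactly one buffer, and the usage numbers are simple ratios rather than Flink's real metrics.

```python
from collections import deque


class ToyReceiver:
    """Hypothetical stand-in for one input channel on the Sink side."""

    def __init__(self, num_buffers):
        self.capacity = num_buffers
        self.free_buffers = num_buffers  # buffers not yet offered to the sender
        self.queued = deque()            # received records not yet processed

    def announce_credit(self):
        """Offer every free buffer to the sender, one credit per buffer."""
        credits = self.free_buffers
        self.free_buffers = 0
        return credits

    def receive(self, record):
        self.queued.append(record)

    def in_pool_usage(self):
        return len(self.queued) / self.capacity


class ToySender:
    """Hypothetical stand-in for one subpartition on the Map side."""

    def __init__(self, pool_size):
        self.pool_size = pool_size
        self.backlog = deque()  # records waiting for credit
        self.credits = 0        # plays the role of numCreditsAvailable

    def produce(self, record):
        """Return False when the output pool is full (the task would block)."""
        if len(self.backlog) >= self.pool_size:
            return False
        self.backlog.append(record)
        return True

    def flush(self, receiver):
        """Send queued records while credit remains."""
        while self.backlog and self.credits > 0:
            receiver.receive(self.backlog.popleft())
            self.credits -= 1

    def out_pool_usage(self):
        return len(self.backlog) / self.pool_size


def slow_sink_scenario():
    """A sink that never processes anything: the usual backpressure picture."""
    receiver = ToyReceiver(num_buffers=2)
    sender = ToySender(pool_size=4)
    sender.credits += receiver.announce_credit()
    for record in range(10):
        sender.produce(record)
        sender.flush(receiver)
    return sender.out_pool_usage(), receiver.in_pool_usage(), sender.credits


print(slow_sink_scenario())
```

In this model a slow sink ends with the sender out of credit and both pools full. The job here shows the sender out of credit and outPoolUsage full as well, but inPoolUsage is 0, which is the part that does not fit.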








Best
Weihua Hu

> On May 24, 2020, at 21:57, Zhijiang <wangzhijiang999@aliyun.com> wrote:
> 
> Hi Weihua,
> 
> From the info below, this is what is expected with credit-based flow control.
> 
> I guess one of the Sink subtasks is causing the backpressure, so you see that there
> are no available credits on the Sink side and the outPoolUsage of Map is almost 100%.
> That is exactly the credit-based state in the case of backpressure.
> 
> If you want to analyze the root cause of the backpressure, you can trace the task stack of
> the respective Sink subtask to find which operation is expensive, then try increasing
> the parallelism or improving the UDF (if it is the bottleneck). In addition, I am not sure
> why you chose rescale to shuffle data among operators. The default forward mode can give
> really good performance if you set the same parallelism for all of them.
> 
> Best,
> Zhijiang
> ------------------------------------------------------------------
> From:Weihua Hu <huweihua.ckl@gmail.com>
> Send Time: May 24, 2020 (Sunday) 18:32
> To:user <user@flink.apache.org>
> Subject: Single task backpressure problem with Credit-based Flow Control
> 
> Hi, all
> 
> I ran into a weird single Task BackPressure problem.
> 
> JobInfo:
>     DAG: Source (1000)-> Map (2000)-> Sink (1000), which is linked via rescale.

>     Flink version: 1.9.0
>     
> There is no related info in the jobmanager/taskmanager logs.
> 
> Through Metrics, I see that Map (242)'s outPoolUsage is full, but its downstream Sink (121)'s
> inPoolUsage is 0.
> 
> After dumping the memory and analyzing it, I found:
> Sink (121)'s RemoteInputChannel.unannouncedCredit = 0,
> Map (242)'s CreditBasedSequenceNumberingViewReader.numCreditsAvailable = 0.
> This is not consistent with my understanding of the Flink network transmission mechanism.
> 
> Can someone help me? Thanks a lot.
> 
> 
> Best
> Weihua Hu
> 
> 
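
For reference, the pairing of Map (242) with Sink (121) in the original report is what a simple model of rescale predicts for Map (2000) -> Sink (1000). The sketch below assumes that, when the upstream parallelism is an integer multiple of the downstream parallelism, each downstream subtask reads from one contiguous group of upstream subtasks. The function names are made up for illustration and this is not Flink's code; other parallelism ratios are deliberately not handled.

```python
def _factor(upstream_parallelism, downstream_parallelism):
    """Number of upstream subtasks per downstream subtask (even split only)."""
    if upstream_parallelism % downstream_parallelism != 0:
        raise ValueError("this sketch only covers evenly divisible parallelism")
    return upstream_parallelism // downstream_parallelism


def rescale_downstream_index(upstream_index, upstream_parallelism, downstream_parallelism):
    """0-based downstream subtask fed by the given 0-based upstream subtask."""
    return upstream_index // _factor(upstream_parallelism, downstream_parallelism)


def rescale_upstream_indices(downstream_index, upstream_parallelism, downstream_parallelism):
    """0-based upstream subtasks that feed the given 0-based downstream subtask."""
    factor = _factor(upstream_parallelism, downstream_parallelism)
    start = downstream_index * factor
    return list(range(start, start + factor))


print(rescale_downstream_index(242, 2000, 1000))  # 121
print(rescale_upstream_indices(121, 2000, 1000))  # [242, 243]
```

If the numbers in the metric names are 1-based instead, the same model still pairs Map 242 with Sink 121 (index 241 maps to index 120).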

