flink-dev mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Monitoring backpressure
Date Mon, 07 Dec 2015 09:51:04 GMT
I have discussed this quite a bit with other people.

It is not totally straightforward. One could try and measure exhaustion of
the output buffer pools, but that fluctuates a lot - it would need some
work to get a stable metric from that...
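
To illustrate what "a stable metric" could mean here, below is a minimal sketch that smooths a noisy "pool exhausted / not exhausted" observation with an exponential moving average. The class name ExhaustionGauge, the record() method, and the choice of smoothing are all assumptions made for illustration; nothing like this is stated to exist in Flink, and how you would obtain the exhausted flag from the buffer pool is left open.

```java
/**
 * Hypothetical helper (not a Flink class): turns a fluctuating 0/1
 * "output buffer pool exhausted" observation into a smoothed value
 * between 0.0 and 1.0 using an exponential moving average.
 */
public final class ExhaustionGauge {
    private final double alpha;   // weight of the newest sample, in (0, 1]
    private double smoothed;      // current smoothed value
    private boolean initialized;  // false until the first sample arrives

    public ExhaustionGauge(double alpha) {
        if (alpha <= 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.alpha = alpha;
    }

    /** Records one observation and returns the updated smoothed value. */
    public double record(boolean exhausted) {
        double sample = exhausted ? 1.0 : 0.0;
        if (!initialized) {
            // The first sample seeds the average directly.
            smoothed = sample;
            initialized = true;
        } else {
            smoothed = alpha * sample + (1.0 - alpha) * smoothed;
        }
        return smoothed;
    }

    /** Returns the current smoothed value (0.0 before any sample). */
    public double value() {
        return smoothed;
    }
}
```

A small alpha reacts slowly but hides short spikes; a large alpha follows the raw signal closely. Which value works for a real job is something that would have to be found by experiment.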

If you have a profiler that you can attach to the processes, you could
check whether a lot of time is spent within the "requestBufferBlocking()"
method of the buffer pool...
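
If no profiler is at hand, the same idea can be approximated by sampling thread stack traces. The sketch below is only an illustration under assumptions: the class BlockedInMethodSampler and its methods are hypothetical, it has to run inside the JVM you want to inspect, and it only matches on the method name "requestBufferBlocking" mentioned above. It uses the standard Thread.getAllStackTraces() call and counts, per thread name, how many samples had that method somewhere on the stack.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sampling helper; a rough stand-in for a real profiler. */
public final class BlockedInMethodSampler {

    /** Returns true if any frame of the stack is in a method with the given name. */
    public static boolean isInMethod(StackTraceElement[] stack, String methodName) {
        for (StackTraceElement frame : stack) {
            if (frame.getMethodName().equals(methodName)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Takes the given number of samples of all live threads, sleeping
     * intervalMillis between samples, and returns for each thread name
     * the number of samples in which methodName was on its stack.
     * Threads that never matched do not appear in the result.
     */
    public static Map<String, Integer> sample(String methodName, int samples, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
                if (isInMethod(e.getValue(), methodName)) {
                    hits.merge(e.getKey().getName(), 1, Integer::sum);
                }
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException ex) {
                // Restore the interrupt flag and stop sampling early.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return hits;
    }
}
```

For example, calling sample("requestBufferBlocking", 100, 50) and dividing each count by 100 gives a rough per-thread fraction of samples spent waiting for an output buffer. A thread with a high fraction is being held back by something downstream of it, so in a chain like task1 -> task2 -> task3 -> task4 the slow operator would be the first one after the last thread that shows a high fraction.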

Stephan


On Mon, Dec 7, 2015 at 9:45 AM, Gyula Fóra <gyfora@apache.org> wrote:

> Hey guys,
>
> Is there any way to monitor the backpressure in the Flink job? I find it
> hard to debug slow operators because of the backpressure mechanism so it
> would be good to get some info out of the network layer on what exactly
> caused the backpressure.
>
> For example:
>
> task1 -> task2 -> task3 -> task4
>
> I want to figure out whether task 2 or task 3 is slow.
>
> Any ideas?
>
> Thanks,
> Gyula
>
