flink-user mailing list archives

From Kien Truong <duckientru...@gmail.com>
Subject Re: High back-pressure after recovering from a save point
Date Fri, 14 Jul 2017 09:18:20 GMT

Sorry for the version typo, I'm running 1.3.1. I did not test with 1.2.x.

The job runs fine, with almost zero back-pressure, if it's started from scratch or if I reuse
the Kafka consumer group id without specifying the savepoint.

Best regards, 

On Jul 14, 2017, at 15:59, Stephan Ewen <sewen@apache.org> wrote:
>Flink 1.3.2 does not yet exist. Do you mean 1.3.1 or latest master?
>Can you tell us whether this occurs only in 1.3.x and worked well in 1.2.x?
>If you just keep the job running without savepoint/restore, you do not run
>into backpressure situations?
>On Fri, Jul 14, 2017 at 1:15 AM, Kien Truong <duckientruong@gmail.com> wrote:
>> Hi Fabian,
>> This happens to me even when the restore is immediate, so there's not much
>> data in Kafka to catch up on (5 minutes max).
>> Regards
>> Kien
>> On Jul 13, 2017, at 23:40, Fabian Hueske <fhueske@gmail.com> wrote:
>>> I would guess that this is quite usual because the job has to catch up on
>>> work.
>>> For example, if you took a save point two days ago and restore the job
>>> today, the input data of the last two days has been written to Kafka
>>> (assuming Kafka as source) and needs to be processed.
>>> The job will now read as fast as possible from Kafka to catch up to the
>>> present. This means the data is ingested much faster (as fast as Kafka can
>>> read and ship it) than during regular processing (as fast as your sources
>>> produce).
>>> The processing speed is bound by your Flink job, which means there will be
>>> backpressure.
>>> Once the job has caught up, the backpressure should disappear.
>>> Best, Fabian
>>> 2017-07-13 15:48 GMT+02:00 Kien Truong <duckientruong@gmail.com>:
>>>> Hi all,
>>>> I have one job where back-pressure is significantly higher after
>>>> resuming from a save point.
>>>> Because that job makes heavy use of stateful functions with
>>>> RocksDBStateBackend,
>>>> I suspect that this is the cause of the performance degradation.
>>>> Has anyone encountered similar issues, or does anyone have tips for
>>>> debugging?
>>>> I'm using Flink 1.3.2 with YARN in detached mode.
>>>> Regards,
>>>> Kien
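
Fabian's catch-up argument can be put into rough numbers. The sketch below is a back-of-the-envelope estimate only, not anything Flink provides: the function name, its parameters, and the example rates are all hypothetical, and it assumes constant rates. It estimates how long a restored job would need to drain the backlog written to the source since the savepoint.

```python
def catch_up_seconds(backlog_records, produce_rate, process_rate):
    """Estimate the time (seconds) a restored job needs to drain its backlog.

    backlog_records: records written to the source since the savepoint (assumed known)
    produce_rate:    records/s still arriving while the job catches up (assumed constant)
    process_rate:    records/s the job can sustain at full speed (assumed constant)
    """
    if backlog_records < 0 or produce_rate < 0 or process_rate < 0:
        raise ValueError("backlog and rates must be non-negative")
    if backlog_records == 0:
        return 0.0  # nothing to catch up on
    headroom = process_rate - produce_rate
    if headroom <= 0:
        # The job is no faster than the producers, so the backlog never shrinks.
        return float("inf")
    return backlog_records / headroom


# Hypothetical numbers: 5 minutes of backlog at 1,000 records/s,
# and a job that can process 4,000 records/s.
backlog = 5 * 60 * 1_000
print(catch_up_seconds(backlog, 1_000, 4_000))  # 100.0
```

Under these assumed numbers the backlog is gone after 100 seconds. Back-pressure that lasts much longer than such an estimate would suggest a cause other than catch-up, which is the situation Kien describes.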
