flink-user mailing list archives

From vino yang <yanghua1...@gmail.com>
Subject Re: AskTimeoutException when canceling job with savepoint on flink 1.6.0
Date Thu, 06 Sep 2018 02:24:41 GMT
Hi Jelmer,

Here is a similar question; you can refer to the discussion there. [1]

[1]:
http://mail-archives.apache.org/mod_mbox/flink-user/201808.mbox/%3CCAMJEyBa9zJX_huqTLxDCu87hpHRVRzXZoYJpQxzXDkQ2H_kiig@mail.gmail.com%3E

Hi Till and Chesnay,

Several users have encountered this problem in the past month.
Maybe the community should prioritize the stability of this part, or
list guidelines in the FAQ of the official documentation?

Thanks, vino.

jelmer <jkuperus@gmail.com> wrote on Wed, Sep 5, 2018 at 8:48 PM:

> I am trying to upgrade a job from Flink 1.4.2 to 1.6.0.
>
> When we deploy, we cancel the job with a savepoint and then deploy the new
> version of the job from that savepoint. Because our jobs tend to have a lot
> of state, it often takes multiple minutes for our savepoints to complete.
>
> On Flink 1.4.2 we set *akka.client.timeout* to a high value to make sure
> the request did not time out.
>
> However, on Flink 1.6.0 I get an *AskTimeoutException*, and increasing
> *akka.client.timeout* only works if I apply it to the running Flink
> process.
> Applying it to just the Flink client does nothing.
>
> I am reluctant to configure this on the container itself because, afaik, it
> applies to everything inside Flink's internal actor system, not just to
> creating savepoints.
>
> What is the correct way to use cancel with savepoint for jobs with lots of
> state in Flink 1.6.0?
>
> I attached the error.
>
