flink-user mailing list archives

From Nico Kruber <n...@data-artisans.com>
Subject Re: Can't get my job restarted on job manager failures
Date Tue, 20 Jun 2017 14:34:24 GMT
Hi Mike,
Have you configured ZooKeeper [1]? As far as I know, it is required for a
high-availability (YARN) session and is used to store JobManager state.
Without it, a recovery would not know what to recover from.
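
As a rough sketch only, the HA-related entries in flink-conf.yaml could look
something like the following. The key names are as I recall them from the 1.3
docs, so please check them against [1]. The ZooKeeper hosts, the HDFS path and
the attempt count are placeholders I made up, not values from your setup.

```yaml
# Enable ZooKeeper-based high availability for the JobManager
high-availability: zookeeper
# Hypothetical ZooKeeper quorum; replace with your own hosts
high-availability.zookeeper.quorum: zk1.example.com:2181,zk2.example.com:2181
# Hypothetical durable location where JobManager metadata is stored
high-availability.zookeeper.storageDir: hdfs:///flink/recovery
# Root znode under which Flink keeps its entries (assumed value)
high-availability.zookeeper.path.root: /flink
# Assumed value: how often YARN may restart the application master
yarn.application-attempts: 10
```

The restart-strategy settings you already have cover task failures while the
job is running. The entries above are what should let a newly started
JobManager find and recover the job. YARN's own limit on application master
attempts (in yarn-site.xml) probably also caps yarn.application-attempts.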


Nico

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html#yarn-cluster-high-availability

On Tuesday, 20 June 2017 13:23:35 CEST Mikhail Pryakhin wrote:
> Hello,
> 
> I'm currently trying to check whether my job is restarted in case of Job
> Manager failure. The job is submitted as a single job on YARN with the
> following options set in the flink-conf.yaml:
> 
> restart-strategy: fixed-delay
> restart-strategy.fixed-delay.attempts: 3
> restart-strategy.fixed-delay.delay: 10 s
> 
> Then I kill the Job Manager container. After that YARN starts a new Job
> Manager container but the job is not started. What am I doing wrong? Do I
> need something else to be configured to enable job restarts on JM failure?
> 
> I'm using Flink 1.3 with Hadoop 2.6.
> 
> Thanks in advance.
> 
> Kind Regards,
> Mike Pryakhin

