flink-user mailing list archives

From Thomas Lamirault <thomas.lamira...@ericsson.com>
Subject RE: Flink HA
Date Fri, 19 Feb 2016 08:39:44 GMT
Thanks for the quick reply!

> state.backend.fs.checkpointdir
It is actually pointing to an HDFS directory, but I will modify recovery.zookeeper.path.root.

> This is only relevant if you are using YARN. From your complete
Yes, I forgot to mention that we will use YARN.

>Does this help?
Yes, a lot :-)


From: Ufuk Celebi [uce@apache.org]
Sent: Thursday, 18 February 2016 19:19
To: user@flink.apache.org
Subject: Re: Flink HA

On Thu, Feb 18, 2016 at 6:59 PM, Thomas Lamirault
<thomas.lamirault@ericsson.com> wrote:
> We are trying Flink in HA mode.

Great to hear!

> We set the following in the Flink YAML:
> state.backend: filesystem
> recovery.mode: zookeeper
> recovery.zookeeper.quorum: <Our quorum>
> recovery.zookeeper.path.root: <path>
> recovery.zookeeper.storageDir: <storageDir>
> recovery.backend.fs.checkpointdir: <pathcheckpoint>

It should be state.backend.fs.checkpointdir.

Just to check: Both state.backend.fs.checkpointdir and
recovery.zookeeper.path.root should point to a file system path.
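
For reference, a minimal sketch of the block with that one key renamed. The
placeholder values are kept exactly as you posted them and nothing else is
changed, so treat it as an illustration rather than a verified configuration:

```yaml
# flink-conf.yaml (sketch; placeholders are the ones from the original message)
state.backend: filesystem
recovery.mode: zookeeper
recovery.zookeeper.quorum: <Our quorum>
recovery.zookeeper.path.root: <path>
recovery.zookeeper.storageDir: <storageDir>
# was: recovery.backend.fs.checkpointdir
state.backend.fs.checkpointdir: <pathcheckpoint>
```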

> yarn.application-attempts: 100

This is only relevant if you are using YARN. From your complete

> In case of an application crash, we want the pending window to be restored
> when the application restarts.
> Is pending data stored in the <storageDir>/blob directory?
> Also, we are trying to write a script that restarts the application, with the
> last pending window, after the max attempts are exceeded.
> How can I do that? Is a simple restart of the application enough, or do I
> have to "clean" the recovery.zookeeper.path.root?

Restore happens automatically to the most recently checkpointed state.

Everything under <storageDir> contains the actual state (including
JARs and JobGraph). ZooKeeper contains pointers to this state.
Therefore, you must not delete the ZooKeeper root path.

For the automatic restart, I would recommend using YARN. If you want
to do it manually, you need to restart the JobManager/TaskManager
instances. The application will be recovered automatically from
ZooKeeper/state backend.

Does this help?

– Ufuk