flink-user mailing list archives

From Dominik Safaric <dominiksafa...@gmail.com>
Subject TaskManager failure detection
Date Wed, 22 Feb 2017 11:05:15 GMT
Hi,

While investigating Flink’s fault tolerance capabilities, I would like to know which
components and classes are in charge of TaskManager failure detection and checkpoint restoration.
In addition, how does Flink actually determine that a TaskManager has failed, e.g. due to a
hardware failure?

To my knowledge, the state should be restored by the CheckpointCoordinator or the ExecutionGraph.
Please correct me if I’m wrong.

Thanks in advance,
Dominik

