flink-user mailing list archives

From "Kumar Bolar, Harshith" <hk...@arity.com>
Subject Should the entire cluster be restarted if a single Task Manager crashes?
Date Fri, 18 Jan 2019 09:52:51 GMT
Hi all,

We're running a standalone Flink cluster with 2 Job Managers and 3 Task Managers. Whenever
a TM crashes, we simply restart that particular TM and carry on processing.

But the comments on this question
(https://stackoverflow.com/questions/54149134/what-happen-to-state-in-flink-task-manager-when-crash)
make it look like we need to restart all 5 nodes in the cluster to deal with the failure
of a single TM. Am I reading this right? What would be the consequences if we restart just
the crashed TM and leave the healthy ones running as is?

Thanks,
Harshith
