flink-user mailing list archives

From Timo Walther <twal...@apache.org>
Subject Re: Task Manager fault tolerance does not work
Date Tue, 03 Apr 2018 13:26:09 GMT
@Till: Do you have any advice for this issue?

On 03.04.18 at 11:54, dhirajpraj wrote:
> What I have found is that the TM fault tolerance behaviour is not consistent.
> Sometimes it works and sometimes it doesn't. I am attaching my Java code file
> (which is the main class).
> What I did was:
> 1) Run cluster with JM on machine A, one TM on machine B and one TM on
> machine C
> 2) Submit a job to the cluster. Works fine till now.
> 3) Forcefully kill the TM on machine C. The web UI shows the job failing,
> then restarting, and finally the job comes back up on its own. This is perfect.
> 4) Now I start the TM on machine C and wait for a sufficient amount of time.
> 5) Now kill the TM on machine B. At this point the job fails. Shouldn't the
> job be handled by the running TM on machine C?
>
> Attachment: FlinkPatternDetection.java
> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/t1400/FlinkPatternDetection.java>
> --
> Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
