flink-user mailing list archives

From Istvan Soos <istvan.s...@gmail.com>
Subject fault tolerance: suspend and resume?
Date Wed, 27 Jul 2016 08:10:36 GMT
Hi,

I was wondering how Flink's fault tolerance works, because this page
is short on the details:
https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/fault_tolerance.html

My environment has a backend service that may be down for a couple of
hours at a time (sad, but we are working on fixing that). I have a sink
that writes to that service, and when the service is down the sink
throws an exception. This brings the process down, and I need to
intervene manually to get it up and running again.

I was thinking of rewriting the sink to loop until it is able to write
the data (with a multi-hour tolerance before it throws an exception).
I hope that this will create backpressure on the process, "suspending"
the processing and "resuming" it when the backend service comes back
up.
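
A minimal sketch of the retry loop I have in mind, in plain Java. The
class RetryingWriter and its parameters are made-up names for
illustration, not Flink API, and whether blocking like this actually
produces backpressure is exactly the assumption I am asking about:

```java
import java.time.Duration;
import java.util.concurrent.Callable;

/** Hypothetical helper (not part of Flink): retries a write until it succeeds or a deadline passes. */
final class RetryingWriter {
    private final long maxWaitNanos; // total tolerance, e.g. a few hours
    private final long pauseMillis;  // pause between attempts

    RetryingWriter(Duration maxWait, Duration pause) {
        this.maxWaitNanos = maxWait.toNanos();
        this.pauseMillis = pause.toMillis();
    }

    /** Blocks the calling thread until the attempt succeeds; rethrows the last failure once the deadline has passed. */
    <T> T write(Callable<T> attempt) throws Exception {
        final long start = System.nanoTime();
        while (true) {
            try {
                return attempt.call();
            } catch (InterruptedException e) {
                // Do not swallow interruption, so the task can still be cancelled.
                throw e;
            } catch (Exception e) {
                if (System.nanoTime() - start >= maxWaitNanos) {
                    throw e; // tolerance exhausted: give up and fail as before
                }
                Thread.sleep(pauseMillis);
            }
        }
    }
}
```

The sink's per-record method would wrap its call to the backend service
in write(...), so the calling thread stays blocked while the service is
down.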

Is that assumption correct? Is there a better way to make
suspending and resuming automatic?

Thanks,
  Istvan
