flink-user mailing list archives

From Florian König <florian.koe...@micardo.com>
Subject Re: Telling if a job has caught up with Kafka
Date Fri, 17 Mar 2017 10:07:19 GMT

Thank you, Gyula, for posting that question. I’d also be interested in how this could be done.

You mentioned the dependency on the commit frequency. I’m using https://github.com/quantifind/KafkaOffsetMonitor
to watch the offsets. With the 0.8 Kafka consumer, a job's offsets, as shown in the diagrams, were updated
much more frequently than the checkpointing interval. With the 0.10 consumer, a commit is only made after
a successful checkpoint (or so it seems).

Why is that so? The checkpoint contains the Kafka offsets, so a restored job can resume reading
wherever it left off, regardless of any offset stored in Kafka or ZooKeeper. Why is the offset
not committed regularly, independently of the checkpointing? Or did I misconfigure something?
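
For reference, the "caught up" check we are discussing boils down to comparing, per partition, the offset the job has committed with the current log-end offset. Below is a minimal sketch of that comparison only. The function names, the `tolerance` parameter and the example numbers are my own illustration, and it assumes the two offset maps have already been fetched by some other means (for example a monitoring tool). If commits really only happen on checkpoints, the lag computed this way can overstate the true lag by up to one checkpoint interval's worth of records.

```python
from typing import Dict, Optional


def partition_lag(end_offsets: Dict[int, int],
                  committed: Dict[int, Optional[int]]) -> Dict[int, int]:
    """Per-partition lag: log-end offset minus committed offset."""
    lag = {}
    for partition, end in end_offsets.items():
        offset = committed.get(partition)
        if offset is None:
            # No commit seen yet: report the full log-end offset as an
            # upper bound on the lag (assumption made for this sketch).
            lag[partition] = end
        else:
            # Clamp at zero in case the commit is newer than the end
            # offset snapshot.
            lag[partition] = max(end - offset, 0)
    return lag


def is_caught_up(end_offsets: Dict[int, int],
                 committed: Dict[int, Optional[int]],
                 tolerance: int = 0) -> bool:
    """True if every partition is within `tolerance` records of the log end."""
    lags = partition_lag(end_offsets, committed)
    return all(value <= tolerance for value in lags.values())


# Hypothetical numbers, for illustration only.
end = {0: 100, 1: 50}
job_commits = {0: 90, 1: 50}
print(partition_lag(end, job_commits))
print(is_caught_up(end, job_commits, tolerance=10))
```

The tolerance is there because a strict comparison would almost never succeed on a busy topic when the committed offsets are only refreshed at checkpoint time.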


> On 17.03.2017 at 10:26, Gyula Fóra <gyfora@apache.org> wrote:
> Hi All,
> I am wondering if anyone has some nice suggestions on what would be the simplest/best
> way of telling if a job is caught up with the Kafka input.
> An alternative question would be how to tell if a job is caught up to another job reading
> from the same topic.
> The first thing that comes to my mind is looking at the offsets Flink commits to Kafka.
> However this will only work if every job uses a different group id and even then it is not
> very reliable depending on the commit frequency.
> The use case I am trying to solve is fault tolerant update of a job, by taking a savepoint
> for job1 starting job2 from the savepoint, waiting until it catches up and then killing job1.
> Thanks for your input!
> Gyula
