flink-user mailing list archives

From Vijay Srinivasaraghavan <vijikar...@yahoo.com>
Subject Checkpoint
Date Mon, 07 Mar 2016 19:42:30 GMT
I am writing a small application to understand how checkpoint and recovery works in Flink.
Here is my setup.
- Flink 3-node cluster (1 JM, 2 TM) built from the latest GitHub codebase
- HDP 2.4 deployed with just HDFS, used for checkpoints and the sink
- Kafka 0.9.x
The sample code pulls from a Kafka topic (1 partition), applies some transformations, and sinks the result to HDFS (RollingSink). For now, I am rolling to a new sink file every 1 KB, and a checkpoint is triggered every second.
The messages pumped into Kafka are just sequence numbers 1, 2, 3, ..., N.
Below are my observations from the setup.
1) Flink uses the configured HDFS checkpoint location to maintain checkpoint information.
2) The sink works properly under a normal run (read: no TM failures).
Questions:
1) How do I find, in a readable format, the Kafka topic/partition offset details that Flink maintains in a checkpoint?
2) When I manually simulate TM failures, I sometimes see duplicate data. I was expecting the exactly-once mechanism to work, but found some duplicates. How do I validate whether exactly-once is working?
3) How can I simulate and verify backpressure? I have introduced some delay (Thread.sleep) in the job before the sink, but the "Back Pressure" tab in the UI does not show any indication of whether backpressure is occurring or not.
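My understanding of the effect I am trying to provoke: backpressure is a bounded buffer between a fast producer and a slow consumer filling up until the producer has to block. A conceptual sketch of that (plain JDK, not Flink APIs; in Flink the buffer role is played by the network buffers between tasks, which is presumably what the UI samples for):

```java
import java.util.concurrent.ArrayBlockingQueue;

public class BackpressureSketch {
    public static void main(String[] args) throws InterruptedException {
        // A small bounded buffer, standing in for the buffers between a
        // fast source task and a slow sink task.
        ArrayBlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(4);

        // Fast "source": fills the buffer faster than the sink drains it.
        for (int i = 0; i < 4; i++) {
            buffer.put(i);
        }

        // Buffer full: a non-blocking offer is rejected, i.e. the producer
        // is now back-pressured and would have to wait on the consumer.
        System.out.println("buffer full, offer accepted: " + buffer.offer(99));

        // Slow "sink" drains one element; the producer can proceed again.
        buffer.take();
        System.out.println("after drain, offer accepted: " + buffer.offer(99));
    }
}
```

If the sleep I added sits in an operator that never forces such a buffer to fill (for example, one chained directly with the source), that might explain why the UI shows nothing, but I am not sure.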
Appreciate your thoughts.