flink-user mailing list archives

From "Tzu-Li (Gordon) Tai" <tzuli...@apache.org>
Subject Re: Fink: KafkaProducer Data Loss
Date Mon, 31 Jul 2017 09:41:15 GMT

Thanks a lot for providing this.
I'll try to find some time this week to look into this using your example code.


On 29 July 2017 at 4:46:57 AM, ninad (nninad@gmail.com) wrote:

Hi Gordon, I was also able to reproduce the data loss on a standalone Flink
cluster. I have a stripped-down version of our code here:

Flink standalone 1.3.0 
Kafka 0.9 

*What the code is doing:* 
-consume messages from a Kafka topic ('event.filter.topic' property in
-group them by key
-analyze the events in a window and filter some messages
-send the remaining messages to a Kafka topic ('sep.http.topic' property in

To build: 
./gradlew clean assemble 

The jar needs the path to the 'application.properties' file to run.

Important properties in application.properties: 
event.filter.topic --> source topic 
sep.http.topic --> destination topic 
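
As a rough illustration of how the two properties above could be read and validated before the job starts, here is a minimal sketch. It assumes the file is a standard Java properties file loaded with `java.util.Properties`. The `TopicConfig` class and the sample topic names are hypothetical and are not taken from the original project.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: holds the two topic names named in this message.
final class TopicConfig {
    final String sourceTopic;       // value of 'event.filter.topic'
    final String destinationTopic;  // value of 'sep.http.topic'

    private TopicConfig(String sourceTopic, String destinationTopic) {
        this.sourceTopic = sourceTopic;
        this.destinationTopic = destinationTopic;
    }

    // Loads both properties and fails fast if either one is missing or blank.
    static TopicConfig load(Reader reader) {
        Properties props = new Properties();
        try {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new TopicConfig(
                require(props, "event.filter.topic"),
                require(props, "sep.http.topic"));
    }

    private static String require(Properties props, String key) {
        String value = props.getProperty(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing required property: " + key);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        // Placeholder topic names, for illustration only.
        String sample = "event.filter.topic=events-in\nsep.http.topic=events-out\n";
        TopicConfig config = load(new StringReader(sample));
        System.out.println(config.sourceTopic + " -> " + config.destinationTopic);
    }
}
```

Failing fast on a missing key keeps a misconfigured run from being mistaken for data loss.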

To test: 
-Use the 'EventGenerator' class to publish messages to the source Kafka topic.
The data it publishes won't be filtered by the logic: if you publish 10
messages to the source topic, all 10 should be sent to the destination topic.

-Once we see that Flink has received all the messages, bring down all Kafka brokers.

-Let the Flink jobs fail 2-3 times.

-Restart the Kafka brokers.

Note: Data loss isn't observed on every run; it shows up roughly 1 in 4 times.
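
Since the loss is intermittent, it can help to check each run mechanically rather than by eye. The sketch below is one possible way to do that, assuming every published message carries a unique ID that can be read back from the destination topic. The `DeliveryCheck` class and the sample IDs are hypothetical and are not part of the original code. Duplicates on the destination side are ignored, because replay after a job restart may deliver the same message more than once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for comparing what was published with what arrived.
final class DeliveryCheck {

    // Returns the IDs that were published to the source topic but never
    // seen on the destination topic, in publish order.
    static List<String> missingIds(List<String> published, List<String> received) {
        Set<String> seen = new HashSet<>(received); // duplicates collapse here
        List<String> missing = new ArrayList<>();
        for (String id : published) {
            if (!seen.contains(id)) {
                missing.add(id);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Illustrative IDs only; a real run would collect these from
        // the generator and from a consumer on the destination topic.
        List<String> published = Arrays.asList("e1", "e2", "e3", "e4");
        List<String> received = Arrays.asList("e1", "e3", "e3", "e4");
        List<String> missing = missingIds(published, received);
        System.out.println("Missing " + missing.size() + " message(s): " + missing);
    }
}
```

An empty result means every published message reached the destination at least once; a non-empty result identifies which messages were lost in that run.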

Thanks for all your help. 


View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Fink-KafkaProducer-Data-Loss-tp11413p14522.html
