flink-user mailing list archives

From: Aljoscha Krettek <aljos...@apache.org>
Subject: Re: Record Delivery Guarantee with Kafka 1.0.0
Date: Wed, 14 Mar 2018 20:34:11 GMT
Hi,

How are you checking that records are missing? Flink should flush to Kafka and wait for all
records to be flushed when performing a checkpoint.
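
For reference, a minimal at-least-once setup with the 0.11 connector looks roughly like the sketch below. The broker address, topic name, source, and checkpoint interval are placeholders rather than values from your job, and it assumes the flink-connector-kafka-0.11 dependency is on the classpath:

import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011;
import org.apache.flink.streaming.util.serialization.KeyedSerializationSchemaWrapper;

public class AtLeastOnceKafkaSinkSketch {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // The Kafka sink only flushes its pending records when a checkpoint is taken,
        // so checkpointing must be enabled; 10 seconds here is just an example interval.
        env.enableCheckpointing(10_000L);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092"); // placeholder broker address

        DataStream<String> stream = env.fromElements("a", "b", "c"); // stand-in for the real source

        // With Semantic.AT_LEAST_ONCE the sink flushes the internal KafkaProducer and
        // waits for all outstanding acks before the checkpoint can complete.
        stream.addSink(new FlinkKafkaProducer011<>(
                "my-topic", // placeholder topic
                new KeyedSerializationSchemaWrapper<>(new SimpleStringSchema()),
                props,
                FlinkKafkaProducer011.Semantic.AT_LEAST_ONCE));

        env.execute("at-least-once Kafka sink sketch");
    }
}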

Best,
Aljoscha

> On 13. Mar 2018, at 21:31, Chirag Dewan <chirag.dewan22@yahoo.in> wrote:
> 
> Hi,
> 
> Still stuck on this. 
> 
> My understanding is that this is something Flink can't handle. If the Kafka producer's batch.size is non-zero (which it ideally should be), there will be records buffered in memory and hence data loss in boundary cases. The only way I can handle this with Flink is through my checkpointing interval, which flushes any buffered records.
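> 
> (By checkpointing interval I mean the value passed to enableCheckpointing; a sketch with a placeholder interval:
> 
>     env.enableCheckpointing(10_000L); // the Kafka sink flushes its buffered records at each checkpoint
> 
> so a shorter interval only means the buffer is flushed more often.)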
> 
> Is my understanding correct here? Or am I still missing something?  
> 
> thanks,
> 
> Chirag  
> 
> On Monday, 12 March, 2018, 12:59:51 PM IST, Chirag Dewan <chirag.dewan22@yahoo.in> wrote:
> 
> 
> Hi,
> 
> I am trying to use the Kafka 0.11 sink with the AT_LEAST_ONCE semantic and am experiencing some data loss on Task Manager failure.
> 
> It's a simple job with parallelism=1 and a single Task Manager. After a few checkpoints (Kafka flushes), I kill my Task Manager, which runs as a container on Docker Swarm. 
> 
> I observe a small number of records, usually 4-5, being lost on the Kafka broker (1-broker cluster, 1 topic with 1 partition).
> 
> My FlinkKafkaProducer config is as follows: 
> 
> batch.size=default(16384)
> retries=3
> max.in.flight.requests.per.connection=1
> acks=1
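> 
> For illustration, this is roughly how those properties reach the producer; the broker address and topic name below are placeholders, and it assumes a DataStream<String> called stream:
> 
>     Properties props = new Properties();
>     props.setProperty("bootstrap.servers", "kafka:9092");
>     props.setProperty("retries", "3");
>     props.setProperty("max.in.flight.requests.per.connection", "1");
>     props.setProperty("acks", "1"); // only the partition leader acknowledges writes
>     // batch.size is left at the Kafka default of 16384 bytes
> 
>     stream.addSink(new FlinkKafkaProducer011<>(
>             "my-topic",
>             new KeyedSerializationSchemaWrapper<>(new SimpleStringSchema()),
>             props,
>             FlinkKafkaProducer011.Semantic.AT_LEAST_ONCE));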
> 
> As I understand it, all the messages batched by the KafkaProducer (in its RecordAccumulator memory buffer) are lost. Is this why I can't see my records on the broker? Or is there something I am doing terribly wrong? Any help is appreciated.
> 
> TIA,
> 
> Chirag
> 
>  
>  

