flink-user mailing list archives

From Philip Doctor <philip.doc...@physiq.com>
Subject Flink Kafka reads too many bytes .... Very rarely
Date Mon, 26 Feb 2018 23:02:36 GMT
I’m using Flink 1.4.0 with FlinkKafkaConsumer010, and have been for almost a year. Recently I started getting messages of the wrong length in Flink, which causes my deserializer to fail. Here is what I’ve learned:

  1.  All of my messages are exactly 520 bytes when my producer places them in Kafka.
  2.  About 1% of these messages hit this deserialization issue in Flink.
  3.  When it happens, I read 10104 bytes in Flink.
  4.  When I write the bytes my producer creates to a file on disk (rather than to Kafka), my code reads 520 bytes and consumes them from disk without issue.
  5.  When I use Kafka Tool (http://www.kafkatool.com/index.html) to dump the contents of my topic to disk and read each message one at a time from disk, my code reads 520 bytes per message and consumes them without issue.
  6.  When I write a simple Kafka consumer (not using Flink) and read one message at a time, each message is 520 bytes and my code runs without issue.
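
To make the length check in the list above concrete, here is a minimal sketch of a helper that summarises a payload instead of failing on it. It is only an illustration: `describe_payload` and `EXPECTED_LEN` are hypothetical names, not part of Flink or Kafka, and the code assumes nothing beyond the fixed 520-byte message size stated above.

```python
EXPECTED_LEN = 520  # fixed message size reported in the list above


def describe_payload(payload: bytes, expected_len: int = EXPECTED_LEN) -> dict:
    """Summarise a raw payload so an unexpected length can be logged.

    Hypothetical diagnostic helper: it does not touch Kafka or Flink,
    it only inspects the bytes it is handed.
    """
    # How many complete records would fit, and how many bytes are left over.
    whole_records, remainder = divmod(len(payload), expected_len)
    return {
        "length": len(payload),
        "ok": len(payload) == expected_len,
        "whole_records": whole_records,
        "remainder": remainder,
        # First bytes as hex, to help match the payload against known-good data.
        "head_hex": payload[:16].hex(),
    }


# A well-formed message and an oversized read of the size seen in Flink.
good = describe_payload(bytes(520))
bad = describe_payload(bytes(10104))
print(good["ok"], bad["ok"], bad["whole_records"], bad["remainder"])
```

For a 10104-byte payload this reports 19 whole records and a remainder of 224 bytes, so the oversized read is not a whole number of 520-byte messages. Logging such a summary from inside the deserializer, rather than throwing, might help narrow down which reads are affected.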

#5 and #6 are what lead me to believe that this issue is squarely a problem with Flink.

However, it gets more complicated. I took the messages I wrote out with both my simple consumer and Kafka Tool, loaded them into a local Kafka server, and attached a local Flink cluster, and there I cannot reproduce the error. Yet I can reproduce it 100% of the time in something closer to my production environment.

I realize this last point sounds suspicious, but I have not found anything in the Kafka docs indicating that I might have a configuration issue, and the simple local setup that would have let me iterate on this and debug it has failed me.

I’m really quite at a loss here. I believe there’s a Flink Kafka consumer bug that happens exceedingly rarely, as I went a year without seeing it. I can reproduce it in an expensive environment but not in a “cheap” one.

Thank you for your time. I can provide my sample data set in case that helps; I put it on my Google Drive:
https://drive.google.com/file/d/1h8jpAFdkSolMrT8n47JJdS6x21nd_b7n/view?usp=sharing
That’s the full data set, and about 1% of it ends up failing. It’s really hard to figure out which message, since I can’t read any of the messages that I receive and I get data out of
