flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Björn Hedström <bjorn.e.hedst...@gmail.com>
Subject CEP with Kafka source
Date Fri, 04 Aug 2017 07:40:18 GMT

I am writing a small application which monitors a couple of directories for
files which are read by Kafka and later consumed by Flink. Flink then
performs some operations on the records (such as extracting the embedded
timestamp) and tries to find a pattern using CEP. Since the data can be out
of order I am using a BoundedOutOfOrdernessTimestampExtractor with the
window allowing for elements to come up to 24 hours late. The
TimeCharacteristic is set to EventTime.

However here is where i run into some issues. I noticed that Flink does not
start to process the data through the defined pattern until the watermark
is greater than the  timestamp of the record. This issue does not appear
when using a text-file directly as a source and disregarding Kafka. In
practice this could mean that a pattern only consisting of two consecutive
datapoints would not be found until another subsequent 22 datapoints are
collected. It seems that I am missing something fundamental here and any
help would be appreciated

I am using a FlinkKafkaConsumer010, Flink 1.3.0, Kafka


View raw message