flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gaël Renoux <gael.ren...@datadome.co>
Subject Finite source without blocking save-points
Date Mon, 04 Nov 2019 14:50:02 GMT
Hello everyone,

I have a job which runs continuously, but it also needs to send a single
specific Kafka message on startup. I tried the obvious approach to use
StreamExecutionEnvironment.fromElements and add a Kafka sink, however
that's not possible: the source being finished, it becomes impossible to
stop the job with a save-point later.

The best solution I found is creating a basic Kafka producer to send the
message, and running that producer inside the job's startup script (before
calling StreamExecutionEnvironment.execute()). However, there's a race
condition, where the message could be sent and trigger stuff before the job
is ready to receive messages. In addition, it forces me to have a separate
Kafka producer, while Flink already comes with Kafka sinks. And finally,
it's pretty specific to my use case (sending a Kafka message), and it looks
like there should be a generic solution here.

Do you guys know of any better way to do this? Is there any way to set up a
finite source that will not block save-points?

Just in case, the global use case is nothing special: my job maintains a
set of rules as broadcast state in operators and handle input according to
those rules. On startup, I need to request all rules to be sent at once
(the emitter normally sends updated rules only), in case the rule state has
been lost (happens when we evolve the rule model, for instance), and this
is done through a Kafka message.

Thanks in advance!

Gaël Renoux

View raw message