flink-user mailing list archives

From Dromit <dromitl...@gmail.com>
Subject Why use Kafka after all?
Date Tue, 15 Nov 2016 07:14:46 GMT

As far as I've seen, there are a lot of projects using Flink and Kafka
together, but I'm not seeing the point of that. Let me know what you think
about this.

1. If I'm not wrong, Kafka provides basically two things: storage (record
retention) and fault tolerance in case of failure, while Flink mostly cares
about transforming those records. That means I can write a pipeline with
Flink alone, and even distribute it on a cluster, but in case of failure
some records may be lost, and I won't be able to reprocess the data if I
change the code, since Flink does not retain the records by default (only
when they are written out to a sink). Is that right?

2. In my use case the records come from a WebSocket, and I build a custom
class from the messages on that socket. Should I put those records into a
Kafka topic right away, using a Flink custom source (SourceFunction) with a
Kafka sink (FlinkKafkaProducer), and independently create a Kafka source
(FlinkKafkaConsumer) for that topic and apply the Flink transformations
there? Is that data flow fine?
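In code, the flow I have in mind would look roughly like the sketch below. This is only a sketch, assuming Flink's Kafka 0.9 connector classes (FlinkKafkaProducer09 / FlinkKafkaConsumer09 with SimpleStringSchema); WebSocketSource and readFromWebSocket are placeholders for my custom source, not real classes.

```java
import java.util.Properties;

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer09;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class WebSocketToKafkaPipeline {

    // Job 1: ingest from the WebSocket and write the raw messages to Kafka.
    public static void ingest(String topic, Properties kafkaProps) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.addSource(new WebSocketSource())  // the custom SourceFunction
           .addSink(new FlinkKafkaProducer09<>(topic, new SimpleStringSchema(), kafkaProps));
        env.execute("websocket-to-kafka");
    }

    // Job 2: independently consume the topic and run the transformations.
    public static void process(String topic, Properties kafkaProps) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.addSource(new FlinkKafkaConsumer09<>(topic, new SimpleStringSchema(), kafkaProps))
           .print();  // transformations (map, keyBy, window, ...) would go here
        env.execute("kafka-to-transformations");
    }

    // Placeholder WebSocket source; a real one would open the socket in run().
    public static class WebSocketSource implements SourceFunction<String> {
        private volatile boolean running = true;

        @Override
        public void run(SourceContext<String> ctx) throws Exception {
            while (running) {
                String message = readFromWebSocket();  // placeholder for the socket read
                synchronized (ctx.getCheckpointLock()) {
                    ctx.collect(message);
                }
            }
        }

        @Override
        public void cancel() {
            running = false;
        }

        private String readFromWebSocket() {
            return "...";  // placeholder
        }
    }
}
```

The idea being that if job 2 fails or I change its code, I can restart it and re-read the topic from an earlier offset, since the records are retained in Kafka rather than in Flink.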

Basically what I'm trying to understand with both questions is how and why
people use Flink and Kafka together.

