flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stephan Ewen <se...@apache.org>
Subject Re: Does Kafka connector leverage Kafka message keys?
Date Sun, 10 Apr 2016 12:22:35 GMT
Hi!

You are right with your observations. Right now, you would have to create a
"Tuple2<Key, Value>" in the KeyedDeserializationSchema. That is what also a
KeyedStream holds internally.

A KeyedStream in Flink is more than just a stream that has a Key and a
Value - it is also partitioned by the key, and Flink maintains track of
keyed state in those streams. That's why it has to be explicitly created.

For convenience, one could make an addition that FlinkKafkaConsumer can
accept two DeserializationSchema (one for key, one for value) and return a
Tuple2<Key, Value> automatically.

Greetings,
Stephan


On Sun, Apr 10, 2016 at 5:49 AM, Elias Levy <fearsome.lucidity@gmail.com>
wrote:

> I am wondering if the Kafka connectors leverage Kafka message keys at all?
>
> Looking through the docs my impression is that it does not.  E.g. if I use
> the connector to consume from a partitioned Kafka topic, what I will get
> back is a DataStream, rather than a KeyedStream.  And if I want access to a
> message's key the key must be within the message to extract it or I have to
> make use of a KeyedDeserializationSchema with the connector to access the
> Kafka message key and insert it into the type returned by the connector.
>
> Similar, it would seem that you have give the Kafka product sink a
> KeyedSerializationSchema, which will obtain a Kafka key and a Kafka message
> from the events from a DataStream, but you can product from a KeyedStream
> where the key if obtained from the stream itself.
>
> Is this correct?
>
>

Mime
View raw message