flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tzu-Li (Gordon) Tai" <tzuli...@apache.org>
Subject Re: Why use Kafka after all?
Date Wed, 23 Nov 2016 04:50:49 GMT
Hi Matt,

Just to be clear, what I'm looking for is a way to serialize a POJO class for Kafka but also
for Flink, I'm not sure the interface of both frameworks are compatible but it seems they
aren't.

For Kafka (producer) I need a Serializer and a Deserializer class, and for Flink (consumer)
a SerializationSchema and DeserializationSchema class.

Any example of how to put this together would be greatly appreciated.

There’s actually a related JIRA for this: https://issues.apache.org/jira/browse/FLINK-4050.
The corresponding PR is https://github.com/apache/flink/pull/2705, which adds wrappers for
the Kafka serializers.
Is this feature what you’re probably looking for?

Best Regards,
Gordon


On November 18, 2016 at 12:11:23 PM, Matt (dromitlabs@gmail.com) wrote:

Just to be clear, what I'm looking for is a way to serialize a POJO class for Kafka but also
for Flink, I'm not sure the interface of both frameworks are compatible but it seems they
aren't.

For Kafka (producer) I need a Serializer and a Deserializer class, and for Flink (consumer)
a SerializationSchema and DeserializationSchema class.

Any example of how to put this together would be greatly appreciated.

On Thu, Nov 17, 2016 at 9:12 PM, Dromit <dromitlabs@gmail.com> wrote:
Tzu-Li Tai, thanks for your response.

I've seen the example you mentioned before, TaxiRideSchema.java, but it's way too simplified.

In a real POJO class you may have multiple fields such as integers, strings, doubles, etc.
So serializing them as a string like in the example wouldn't work (you can't put together
two arbitrary strings and later split the byte array to get each of them, same for two integers,
and nearly any other types).

I feel there should be a more general way of doing this regardless of the fields on the class
you're de/serializing.

What do you do in these cases? It should be a pretty common scenario!

Regards,
Matt

On Wed, Nov 16, 2016 at 2:01 PM, Philipp Bussche <philipp.bussche@gmail.com> wrote:
Hi Dromit

I started using Flink with Kafka but am currently looking into Kinesis to
replace Kafka.
The reason behind this is that eventually my application will run in
somebody's cloud and if I go for AWS then I don't have to take care of
operating Kafka and Zookeeper myself. I understand this can be a challenging
task.
Up to know where the Kafka bit is only running in a local test environment I
am happy running it as I just start 2 Docker containers and it does the job.
But this also means I have no clue how Kafka really works and what I need to
be careful with.
Besides knowledge which is required as it seems for Kafka costs is another
aspect here.
If one wants to operate a Kafka cluster plus Zookeeper on let's say the
Amazon cloud this might actually be more expensive than "just" using Kinesis
as a service.
There are apparently draw backs in terms of functionality and performance
but for my use case that does not seem to matter.

Philipp



--
View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Why-use-Kafka-after-all-tp10112p10155.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.



Mime
View raw message