flink-user mailing list archives

From "Tzu-Li (Gordon) Tai" <tzuli...@apache.org>
Subject Re: Serializers and Schemas
Date Thu, 08 Dec 2016 07:15:05 GMT
Hi Matt,

1. There’s some in-progress work on wrapper util classes for Kafka de/serializers here [1]
that allow Kafka de/serializers to be used with the Flink Kafka Consumers/Producers with minimal user
The PR also has some proposed additions to the documentation for the wrappers.

2. I feel that it would be good to have more documentation on Flink’s de/serializers because
they’ve been frequently asked about on the mailing lists. At the same time, the fastest / most
efficient de/serialization approach would probably be tailored to each use case, so we’d need to
think more about the presentation and the purpose of the documentation.


[1] https://github.com/apache/flink/pull/2705

On December 8, 2016 at 5:00:19 AM, milind parikh (milindsparikh@gmail.com) wrote:

Why not use a self-describing format (JSON), stream it as a String, read it through a JSON reader,
and avoid top-level reflection?




Apologies if I misunderstood the question. But I can quite see how to model your Product class
(or indeed any POJO) in a fairly generic way (assuming JSON).

The real issue arises when you have different versions of the same POJO class: that requires storing
enough information to dynamically instantiate the actual version of the class, which I believe
is beyond the simple use case.
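
One possible reading of "storing enough information" is a version tag written ahead of the
fields, so the reader knows which layout follows. The sketch below is only an illustration
under that assumption: VersionedProductCodec, the two layouts, and the reduced Product class
are hypothetical and are not taken from Flink, Kafka, or anyone's code in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Reduced, hypothetical Product: version 1 has code and price, version 2 adds description.
class Product {
    String code;
    double price;
    String description; // stays null when a version 1 record is read
}

// Hypothetical helper: the first byte of every record is the layout version.
final class VersionedProductCodec {
    private VersionedProductCodec() {}

    static byte[] write(Product p, int version) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(version);       // version tag comes first
            out.writeUTF(p.code);
            out.writeDouble(p.price);
            if (version >= 2) {
                out.writeUTF(p.description);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Product read(byte[] message) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(message))) {
            int version = in.readUnsignedByte();
            if (version < 1 || version > 2) {
                throw new IllegalArgumentException("unknown version " + version);
            }
            Product p = new Product();
            p.code = in.readUTF();
            p.price = in.readDouble();
            if (version >= 2) {
                p.description = in.readUTF(); // only present in the version 2 layout
            }
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A reader built this way can accept old and new records on the same topic, at the cost of
keeping every historical layout in the read method.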


On Dec 7, 2016 2:42 PM, "Matt" <dromitlabs@gmail.com> wrote:
I've read your example, but I've found the same problem. You're serializing your POJO as a
string, where all fields are separated by "\t". This may work for you, but not in general.
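
To make the concern concrete, here is a minimal sketch of a tab-separated encoding and how it
breaks once a field contains a tab. TabSeparatedDemo and its methods are hypothetical names
made up for this illustration; this is not Luigi's code.

```java
// Hypothetical demo of a naive tab-separated encoding for four Product-like fields.
final class TabSeparatedDemo {
    private TabSeparatedDemo() {}

    // Joins the fields with a tab; nothing escapes tabs inside a field.
    static String encode(String code, double price, String description, long created) {
        return String.join("\t",
                code, Double.toString(price), description, Long.toString(created));
    }

    // Splits on every tab; the -1 limit keeps trailing empty fields.
    static String[] decode(String line) {
        return line.split("\t", -1);
    }

    public static void main(String[] args) {
        // Four fields go in, four come out.
        System.out.println(decode(encode("A-1", 9.99, "plain", 1L)).length);
        // A tab inside the description yields five pieces, so the record is misread.
        System.out.println(decode(encode("A-1", 9.99, "red\tlarge", 1L)).length);
    }
}
```

Escaping the delimiter or length-prefixing each field would avoid this, which is part of why
a delimiter-joined String does not generalize.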


I would like to see a more "generic" approach for the class Product in my last message. I
believe a more general purpose de/serializer for POJOs should be possible to achieve using

On Wed, Dec 7, 2016 at 1:16 PM, Luigi Selmi <luigiselmi@gmail.com> wrote:
Hi Matt,

I had the same problem, trying to read some records in event time using a POJO, doing some
transformation and saving the result into Kafka for further processing. I am not yet done but
maybe the code I wrote starting from the Flink Forward 2016 training docs could be useful.




On 7 December 2016 at 16:35, Matt <dromitlabs@gmail.com> wrote:

I don't quite understand how to integrate Kafka and Flink. After a lot of thought and hours
of reading, I feel I'm still missing something important.

So far I haven't found a non-trivial but simple example of a stream of a custom class (POJO).
It would be good to have such an example in the Flink docs. I can think of many scenarios
in which using SimpleStringSchema is not an option, but all Kafka+Flink guides insist on using

Maybe we can add a simple example to the documentation [1]; it would be really helpful for
many of us. Also, explaining how to create a Flink De/SerializationSchema from a Kafka De/Serializer
would be really useful and would save a lot of people a lot of time. It's not clear why
you need both of them, or whether you need both at all.

As far as I know Avro is a common choice for serialization, but I've read Kryo's performance
is much better (true?). I guess though that the fastest serialization approach is writing
your own de/serializer.

1. What do you think about adding some thoughts on this to the documentation?
2. Can anyone provide an example for the following class?

public class Product {
    public String code;
    public double price;
    public String description;
    public long created;
}


[1] http://data-artisans.com/kafka-flink-a-practical-how-to/
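
As one hedged answer to question 2, a hand-written binary round trip for a class with the same
four fields could look roughly like the sketch below. It uses only the JDK. ProductCodec is a
hypothetical helper, not a Flink or Kafka class; in a Flink job this logic would presumably
sit inside implementations of Flink's SerializationSchema and DeserializationSchema, and that
wiring is left out here. It also assumes no field is null and that each string fits the
65535-byte limit of writeUTF.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Same fields as the Product class in the question.
class Product {
    public String code;
    public double price;
    public String description;
    public long created;
}

// Hypothetical helper: fixed field order, no reflection, no delimiters.
final class ProductCodec {
    private ProductCodec() {}

    static byte[] serialize(Product p) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(p.code);          // length-prefixed, so tabs in a field are harmless
            out.writeDouble(p.price);
            out.writeUTF(p.description);
            out.writeLong(p.created);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Product deserialize(byte[] message) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(message))) {
            Product p = new Product();
            // Fields must be read in exactly the order they were written.
            p.code = in.readUTF();
            p.price = in.readDouble();
            p.description = in.readUTF();
            p.created = in.readLong();
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Whether this beats Avro or Kryo for a given workload is not something the sketch shows; it
only illustrates the "write your own de/serializer" option.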

Luigi Selmi, M.Sc.
Fraunhofer IAIS Schloss Birlinghoven . 
53757 Sankt Augustin, Germany
Phone: +49 2241 14-2440
