kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthias J. Sax" <matth...@confluent.io>
Subject Re: [DISCUSS] KIP-466: Add support for List<T> serialization and deserialization
Date Thu, 20 Jun 2019 21:45:31 GMT
For encoding the list-type: I see John's point about re-encoding the
list-type redundantly. However, I also don't like the idea that the
Deserializer returns a fixed type...

Maybe it's best allow users to specify the target list type on
deserialization via config?

Similar for the primitive types: I don't think we need to encode the
type size, but users could specify the type on the deserializer (via a
config again)?


About generics: nesting could be arbitrarily deep. Hence, I doubt we can
support this and a cast will be necessary at some point in the user code.



-Matthias



On 6/20/19 1:21 PM, John Roesler wrote:
> Hey Daniyar,
> 
> Thanks for looking at it!
> 
> Something like your screenshot is more along the lines of what I was
> thinking. Sorry, but I didn't follow what you mean, how would that not
> be "vanilla java"?
> 
> Unfortunately the deserializer needs more information, though. For
> example, what if the inner type is a Map<String,String>? The serde could
> only be used to produce a LinkedList<Map>, thus, we'd still need an
> inner serde, like you have in the KIP (Serde<T> innerSerde).
> 
> Something more like Serde<LinkedList<MyRecord>> = Serdes.listSerde(
>   /**list type**/ LinkedList.class,
>   /**inner serde**/ new MyRecordSerde()
> )
> 
> And in configuration, it's something like:
> default.key.serde: org...ListSerde
> default.key.list.serde.type: java.util.LinkedList
> default.key.list.serde.inner: com.mycompany.MyRecordSerde
> 
> 
> What do you think?
> Thanks,
> -John
> 
> On Thu, Jun 20, 2019 at 2:46 PM Development <dev@yeralin.net
> <mailto:dev@yeralin.net>> wrote:
> 
>     Hey John,
> 
>     I gave read about TypeReference. It could work for the list serde.
>     However, it is not directly
>     supported: https://github.com/FasterXML/jackson-databind/issues/1490
>     The only way is to pass an actual class object into the constructor,
>     something like:
> 
>     It could be an option, but not a pretty one. What do you think of my
>     approach to use vanilla java and canonical class name? (As described
>     previously)
> 
>     Best,
>     Daniyar Yeralin
> 
>>     On Jun 20, 2019, at 2:45 PM, Development <dev@yeralin.net
>>     <mailto:dev@yeralin.net>> wrote:
>>
>>     Hi John,
>>
>>     Thank you for your input! Yes, my idea looks a little bit over
>>     engineered :)
>>
>>     I also wanted to see a feedback from Mathias as well since he gave
>>     me an idea about storing fixed/variable size entries.
>>
>>     Best,
>>     Daniyar Yeralin
>>
>>>     On Jun 18, 2019, at 6:06 PM, John Roesler <john@confluent.io
>>>     <mailto:john@confluent.io>> wrote:
>>>
>>>     Hi Daniyar,
>>>
>>>     That's a very clever solution!
>>>
>>>     One observation is that, now, this is what we might call a
>>>     polymorphic
>>>     serde. That is, you're detecting the actual concrete type and then
>>>     promising to produce the exact same concrete type on read. There are
>>>     some inherent problems with this approach, which in general require
>>>     some kind of  schema registry (not necessarily Schema Registry, just
>>>     any registry for schemas) to solve.
>>>
>>>     Notice that every serialized record has quite a bit of duplicated
>>>     information: the concrete type as well as a byte to indicate whether
>>>     the value type is a fixed size, and, if so, an integer to
>>>     indicate the
>>>     actual size. These constitute a schema, of sorts, because they
>>>     tell us
>>>     later how exactly to deserialize the data. Unfortunately, this
>>>     information is completely redundant. In all likelihood, the
>>>     information will be exactly the same for every record in the topic.
>>>     This problem is essentially the core motivation for serializations
>>>     like Avro: to move the schema outside of the serialization itself, so
>>>     that the records won't contain so much redundant information.
>>>
>>>     In this light, I'm wondering if it makes sense to go back to
>>>     something
>>>     like what you had earlier in which you don't support perfectly
>>>     preserving the concrete type for _this_ serde, but instead just
>>>     support deserializing to _some_ List. Then, you could defer full,
>>>     perfect, type preservation to serdes that have an external system in
>>>     which to register their type information.
>>>
>>>     There does exist an alternative, if we really do want to preserve the
>>>     concrete type (which does seem kind of nice). You can add a
>>>     configuration option specifically for the serde to configure what the
>>>     list type will be, and maybe what the element type is, as well.
>>>
>>>     As far as "related work" goes, you might be interested to take a look
>>>     at how Jackson can be configured to deserialize into a specific,
>>>     arbitrarily nested, generically parameterized class structure.
>>>     Specifically, you might find
>>>     https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
>>>     interesting.
>>>
>>>     Thanks,
>>>     -John
>>>
>>>     On Mon, Jun 17, 2019 at 12:38 PM Development <dev@yeralin.net
>>>     <mailto:dev@yeralin.net>> wrote:
>>>>
>>>>     bump
>>
> 


Mime
View raw message