From: John Roesler <john@confluent.io>
Date: Tue, 9 Jul 2019 16:56:40 -0500
Subject: Re: [DISCUSS] KIP-466: Add support for List serialization and deserialization
To: dev@kafka.apache.org
Reply-To: dev@kafka.apache.org
In-Reply-To: <689BE80E-6CDF-446F-88F1-15C4604B2398@yeralin.net>
MIME-Version: 1.0
Content-Type: multipart/related; boundary="000000000000628e19058d46a37c"

--000000000000628e19058d46a37c
Content-Type: multipart/alternative; boundary="000000000000628e16058d46a37b"

--000000000000628e16058d46a37b
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

Ah, my apologies, I must have just overlooked it. Thanks for the
update, too.

Just one more super-small question, do we need this variant:

> New method public static <T> Serde<List<T>> ListSerde() in
> org.apache.kafka.common.serialization.Serdes class (infers list
> implementation and inner serde from config file)

It seems like this situation implies my config file is already set up
for the list serde, so passing this serde (e.g., in Produced) would
have the same effect as not specifying it.

I guess that it could be the case that you have the
`default.key/value.serde` set to something else, like StringSerde, but
you still have the `default.key/value.list.serde.impl/element` set.
This seems like it would result in more confusion than convenience, so
my gut instinct is maybe we shouldn't introduce the `ListSerde()`
variant until people actually request it later on.

Thus, we'd just stick with fully config-driven or fully
source-code-driven, not half/half.

What do you think?

Thanks,
-John

On Tue, Jul 9, 2019 at 9:58 AM Development <dev@yeralin.net> wrote:
>
> Hi John,
>
> I hope everyone had a great long weekend.
>
> Regarding Java interfaces, I may not understand you correctly, but I
> think I already listed them:
>
> So for Produced, you would use it in the following fashion, for example:
> Produced.keySerde(Serdes.ListSerde(ArrayList.class, Serdes.Integer()))
>
> I also updated the KIP, and added a section “Serialization Strategy”
> where I describe our logic of conditional serialization based on the
> type of an inner serde.
>
> Thank you!
>
> Best,
> Daniyar Yeralin
>
> On Jun 26, 2019, at 11:44 AM, John Roesler <john@confluent.io> wrote:
>
> Thanks for the update, Daniyar!
>
> In addition to specifying the config interface, can you also specify
> the Java interface? Namely, if I need to pass an instance of this
> serde into the DSL directly, as in Produced, Materialized, etc., what
> constructor(s) would I have available? Likewise with the Serializer
> and Deserializer. I don't think you need to specify the implementation
> logic, since we've already discussed it here.
>
> If you also want to specify the serialized format of the data records
> in the KIP, it could be useful documentation, as well as letting us
> verify the schema for forward/backward compatibility concerns, etc.
>
> Thanks,
> John
>
> On Wed, Jun 26, 2019 at 10:33 AM Development <dev@yeralin.net> wrote:
>
> Hey,
>
> Finally made updates to the KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-466%3A+Add+support+for+List%3CT%3E+serialization+and+deserialization
> Sorry for the delay :)
>
> Thank You!
>
> Best,
> Daniyar Yeralin
>
> On Jun 22, 2019, at 12:49 AM, Matthias J. Sax <matthias@confluent.io> wrote:
>
> Yes, something like this. I did not think about good configuration
> parameter names yet. I am also not sure if I understand all proposed
> configs atm.
> But all configs should be listed and explained in the KIP
> anyway, and we can discuss further after you have updated the KIP (I can
> ask more detailed questions if I have any).
>
> -Matthias
>
> On 6/21/19 2:05 PM, Development wrote:
>
> Yes, you are right. ByteSerializer is not what I need to have in a list
> of primitives.
>
> As for the default constructor and configurability, just want to make
> sure. Is this what you have in mind?
>
> Best,
> Daniyar Yeralin
>
> On Jun 21, 2019, at 2:51 PM, Matthias J. Sax <matthias@confluent.io> wrote:
>
> Thanks for the update!
>
> I think that `ListDeserializer`, `ListSerializer`, and `ListSerde`
> should have a default constructor and it should be possible to pass in
> the `Class listClass` information via a configuration. Otherwise,
> KafkaStreams cannot use it as default serde.
>
> For the primitive serializers: `BytesSerializer` is not primitive IMHO,
> as it is for `byte[]` with variable length -- it's for arrays, not for
> single `byte` (note, that `Bytes` is a Kafka class wrapping `byte[]`).
>
> For tests, we can comment on the PR. No need to do this in the KIP
> discussion.
>
> Can you also update the KIP?
>
> -Matthias
>
> On 6/21/19 11:29 AM, Development wrote:
>
> I made and pushed necessary commits, so we could review the final
> version under PR https://github.com/apache/kafka/pull/6592
>
> I also need some advice on writing tests for this new serde. So far I
> only have two test cases (roundtrip and empty payload), I’m not sure
> if it is enough.
>
> Thank y’all for your help in this KIP :)
>
> Best,
> Daniyar Yeralin
>
> On Jun 21, 2019, at 1:44 PM, John Roesler <john@confluent.io> wrote:
>
> Hey Daniyar,
>
> Looks good to me! Thanks for considering it.
>
> Thanks,
> -John
>
> On Fri, Jun 21, 2019 at 9:04 AM Development <dev@yeralin.net> wrote:
> Hey John and Matthias,
>
> Yes, now I see it all. I’m storing lots of redundant information.
> Here is my final idea. Yes, now a user should pass a list type.
> I realized that the type is not really needed in ListSerializer, but
> only in ListDeserializer:
>
> In ListSerializer we will start storing sizes only if the serializer is
> not a primitive serializer:
>
> Then, in the deserializer, we persist the passed list type, so that during
> deserialization we could create an instance of it with predefined
> listSize for better performance.
> We also try to locate a primitiveSize based on the passed deserializer.
> If it is not there, then primitiveSize will be null, which means
> that each entry’s size was encoded individually.
>
> This looks much cleaner and more concise.
>
> What do you think?
>
> Best,
> Daniyar Yeralin
>
> On Jun 20, 2019, at 5:45 PM, Matthias J. Sax <matthias@confluent.io> wrote:
>
> For encoding the list-type: I see John's point about re-encoding the
> list-type redundantly. However, I also don't like the idea that the
> Deserializer returns a fixed type...
>
> Maybe it's best to allow users to specify the target list type on
> deserialization via config?
>
> Similar for the primitive types: I don't think we need to encode the
> type size, but users could specify the type on the deserializer (via a
> config again)?
>
> About generics: nesting could be arbitrarily deep. Hence, I doubt we can
> support this and a cast will be necessary at some point in the user
> code.
>
> -Matthias
>
> On 6/20/19 1:21 PM, John Roesler wrote:
>
> Hey Daniyar,
>
> Thanks for looking at it!
>
> Something like your screenshot is more along the lines of what I was
> thinking. Sorry, but I didn't follow what you mean, how would that not
> be "vanilla java"?
>
> Unfortunately the deserializer needs more information, though. For
> example, what if the inner type is a Map<String,String>? The serde could
> only be used to produce a LinkedList<Map>, thus, we'd still need an
> inner serde, like you have in the KIP (Serde<T> innerSerde).
>
> Something more like Serde<LinkedList<MyRecord>> = Serdes.listSerde(
>   /**list type**/ LinkedList.class,
>   /**inner serde**/ new MyRecordSerde()
> )
>
> And in configuration, it's something like:
> default.key.serde: org...ListSerde
> default.key.list.serde.type: java.util.LinkedList
> default.key.list.serde.inner: com.mycompany.MyRecordSerde
>
> What do you think?
> Thanks,
> -John
>
> On Thu, Jun 20, 2019 at 2:46 PM Development <dev@yeralin.net> wrote:
>
> Hey John,
>
> I read about TypeReference. It could work for the list serde.
> However, it is not directly supported:
> https://github.com/FasterXML/jackson-databind/issues/1490
>
> The only way is to pass an actual class object into the constructor,
> something like:
>
> It could be an option, but not a pretty one. What do you think of my
> approach to use vanilla java and canonical class name? (As described
> previously)
>
> Best,
> Daniyar Yeralin
>
> On Jun 20, 2019, at 2:45 PM, Development <dev@yeralin.net> wrote:
>
> Hi John,
>
> Thank you for your input! Yes, my idea looks a little bit
> over-engineered :)
>
> I also wanted to see feedback from Matthias as well since he gave
> me an idea about storing fixed/variable size entries.
>
> Best,
> Daniyar Yeralin
>
> On Jun 18, 2019, at 6:06 PM, John Roesler <john@confluent.io> wrote:
>
> Hi Daniyar,
>
> That's a very clever solution!
>
> One observation is that, now, this is what we might call a polymorphic
> serde. That is, you're detecting the actual concrete type and then
> promising to produce the exact same concrete type on read. There are
> some inherent problems with this approach, which in general require
> some kind of schema registry (not necessarily Schema Registry, just
> any registry for schemas) to solve.
>
> Notice that every serialized record has quite a bit of duplicated
> information: the concrete type as well as a byte to indicate whether
> the value type is a fixed size, and, if so, an integer to indicate the
> actual size.
> These constitute a schema, of sorts, because they tell us
> later how exactly to deserialize the data. Unfortunately, this
> information is completely redundant. In all likelihood, the
> information will be exactly the same for every record in the topic.
> This problem is essentially the core motivation for serializations
> like Avro: to move the schema outside of the serialization itself, so
> that the records won't contain so much redundant information.
>
> In this light, I'm wondering if it makes sense to go back to something
> like what you had earlier in which you don't support perfectly
> preserving the concrete type for _this_ serde, but instead just
> support deserializing to _some_ List. Then, you could defer full,
> perfect, type preservation to serdes that have an external system in
> which to register their type information.
>
> There does exist an alternative, if we really do want to preserve the
> concrete type (which does seem kind of nice). You can add a
> configuration option specifically for the serde to configure what the
> list type will be, and maybe what the element type is, as well.
>
> As far as "related work" goes, you might be interested to take a look
> at how Jackson can be configured to deserialize into a specific,
> arbitrarily nested, generically parameterized class structure.
> Specifically, you might find
> https://fasterxml.github.io/jackson-core/javadoc/2.0.0/com/fasterxml/jackson/core/type/TypeReference.html
> interesting.
>
> Thanks,
> -John
>
> On Mon, Jun 17, 2019 at 12:38 PM Development <dev@yeralin.net> wrote:
>
> bump

--000000000000628e16058d46a37b
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
--000000000000628e16058d46a37b-- --000000000000628e19058d46a37c--
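
The size-encoding idea discussed in the thread (write a per-entry size only when the inner serializer is not a fixed-size primitive one, and let the caller pick the list class on deserialization) could be sketched roughly as below. This is not the KIP's or the PR's implementation. The class name `ListSerdeSketch`, its methods and constants, and the wire layout (a 4-byte entry count followed by the entries) are all assumptions made for illustration, and the sketch deliberately avoids Kafka's Serializer/Deserializer interfaces so that it stays self-contained.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch, not Kafka code: all names and the byte layout are assumptions.
final class ListSerdeSketch {

    // Stand-ins for inner serializers/deserializers.
    static final Function<Integer, byte[]> INT_TO_BYTES =
            i -> ByteBuffer.allocate(Integer.BYTES).putInt(i).array();
    static final Function<byte[], Integer> BYTES_TO_INT =
            b -> ByteBuffer.wrap(b).getInt();
    static final Function<String, byte[]> STRING_TO_BYTES =
            s -> s.getBytes(StandardCharsets.UTF_8);
    static final Function<byte[], String> BYTES_TO_STRING =
            b -> new String(b, StandardCharsets.UTF_8);

    // fixedSize == null means "variable-length entries": each entry gets a size prefix.
    // A non-null fixedSize means every entry has that many bytes and no prefix is written.
    static <T> byte[] serialize(List<T> list, Function<T, byte[]> inner, Integer fixedSize) {
        List<byte[]> payloads = new ArrayList<>();
        int total = Integer.BYTES; // leading entry count
        for (T entry : list) {
            byte[] payload = inner.apply(entry);
            payloads.add(payload);
            total += payload.length + (fixedSize == null ? Integer.BYTES : 0);
        }
        ByteBuffer buf = ByteBuffer.allocate(total);
        buf.putInt(list.size());
        for (byte[] payload : payloads) {
            if (fixedSize == null) {
                buf.putInt(payload.length);
            }
            buf.put(payload);
        }
        return buf.array();
    }

    // The list implementation is chosen by the caller (listFactory), not read from the bytes.
    static <T> List<T> deserialize(byte[] data, Function<byte[], T> inner,
                                   Integer fixedSize, Supplier<List<T>> listFactory) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int count = buf.getInt();
        List<T> result = listFactory.get();
        for (int i = 0; i < count; i++) {
            int size = (fixedSize != null) ? fixedSize : buf.getInt();
            byte[] payload = new byte[size];
            buf.get(payload);
            result.add(inner.apply(payload));
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] ints = serialize(Arrays.asList(1, 2, 3), INT_TO_BYTES, Integer.BYTES);
        System.out.println(deserialize(ints, BYTES_TO_INT, Integer.BYTES, LinkedList::new));

        byte[] strings = serialize(Arrays.asList("a", "bc"), STRING_TO_BYTES, null);
        System.out.println(deserialize(strings, BYTES_TO_STRING, null, ArrayList::new));
    }
}
```

Keeping the list class and the fixed entry size on the reader's side, rather than in every record, is the trade-off raised in the thread: records carry less redundant information, but the reader has to be configured consistently with the writer.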