kafka-dev mailing list archives

From Stephane Maarek <steph...@simplemachines.com.au>
Subject Re: Are defaults serde in Kafka streams doing more harm then good ?
Date Thu, 14 Jun 2018 03:34:33 GMT
Thanks Matthias and Guozhang

1) Regarding having JSON, Protobuf, or Avro across the entire topology: this
makes sense. I still wish the builder could take a 'defaultSerde' for keys
and values to make types explicit throughout the topology, rather than a
class name given as a string in a properties object. That might also help
with Java typing through the topology, since a default Serde<T> implies T as
the operators are chained.

1*) I still think that as soon as a 'count' or any 'window' operation
happens, the user needs to override the default serde, which can be
confusing for end users.
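
To illustrate point 1*, here is a minimal sketch in plain Java. It does not
use the Kafka Streams API: `Serializer`, `StringSerializer`, `LongSerializer`
and `writeWithDefault` are hypothetical stand-ins. It assumes the default is
looked up from configuration at runtime, so the compiler cannot check it
against the value type, and that a count produces a Long where the configured
default expects a String.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class DefaultSerdeMismatch {
    // Hypothetical stand-in for a serializer; not Kafka's own interface.
    interface Serializer<T> {
        byte[] serialize(T data);
    }

    static final class StringSerializer implements Serializer<String> {
        @Override
        public byte[] serialize(String data) {
            return data.getBytes(StandardCharsets.UTF_8);
        }
    }

    static final class LongSerializer implements Serializer<Long> {
        @Override
        public byte[] serialize(Long data) {
            return ByteBuffer.allocate(Long.BYTES).putLong(data).array();
        }
    }

    // Mimics a default resolved from configuration at runtime: the compiler
    // cannot check it against the value type that actually reaches it.
    @SuppressWarnings("unchecked")
    static byte[] writeWithDefault(Serializer<?> configuredDefault, Object value) {
        return ((Serializer<Object>) configuredDefault).serialize(value);
    }

    // Returns "ok" if the configured String default can write the value,
    // otherwise the simple name of the exception it raised.
    static String tryDefault(Object value) {
        try {
            writeWithDefault(new StringSerializer(), value);
            return "ok";
        } catch (ClassCastException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        long count = 3L; // stands in for the result of a count
        System.out.println(tryDefault("hello"));
        System.out.println(tryDefault(count));
        // Supplying the matching serializer explicitly avoids the failure.
        System.out.println(new LongSerializer().serialize(count).length);
    }
}
```

The mismatch only shows up when the first record is written, and nothing in
the ClassCastException says that the serializer came from the defaults.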

2) I very much agree a type and serde map could be very useful.
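
As a rough sketch of what a type and serde map could look like, in plain
Java: `SerdeRegistry` and its methods are hypothetical names, not an
existing Kafka Streams API, and serializers are modelled as simple functions
to keep the example self-contained. The last method shows the type erasure
problem Guozhang mentions: two differently parameterized lists share one
runtime class, so a lookup keyed on Class cannot tell them apart.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class SerdeRegistry {
    // Hypothetical registry: runtime class -> serializer function.
    private final Map<Class<?>, Function<Object, byte[]>> serializers = new HashMap<>();

    <T> SerdeRegistry register(Class<T> type, Function<T, byte[]> serializer) {
        // type.cast keeps the stored function type-safe for its own class.
        serializers.put(type, value -> serializer.apply(type.cast(value)));
        return this;
    }

    byte[] serialize(Object value) {
        Function<Object, byte[]> serializer = serializers.get(value.getClass());
        if (serializer == null) {
            throw new IllegalStateException(
                "No serde registered for type " + value.getClass().getName());
        }
        return serializer.apply(value);
    }

    // Pre-registers a couple of primitive-like types, as suggested in the thread.
    static SerdeRegistry withPrimitives() {
        return new SerdeRegistry()
            .register(String.class, s -> s.getBytes(StandardCharsets.UTF_8))
            .register(Long.class, n -> ByteBuffer.allocate(Long.BYTES).putLong(n).array());
    }

    // Erasure: both lists have the same runtime class, so a Class-keyed
    // registry cannot choose between a List<String> and a List<Long> serde.
    static boolean erasedTypesCollide() {
        Class<?> strings = new ArrayList<String>().getClass();
        Class<?> longs = new ArrayList<Long>().getClass();
        return strings == longs;
    }

    public static void main(String[] args) {
        SerdeRegistry registry = withPrimitives();
        System.out.println(registry.serialize(42L).length);
        System.out.println(registry.serialize("hi").length);
        System.out.println(erasedTypesCollide());
    }
}
```

A lookup like this works for concrete classes such as String or Long, but
generic value types would need extra type information supplied by the user.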

2*) Big Scala user here, but this will affect maybe 10 percent of users,
unfortunately. Java is still where people try most things out. Still very
excited for that release!

3) I haven't dug through the code, but how easy would it be to indicate to
the end user that a default serde was used when a runtime error occurs? This
could be a very quick, KIP-less win for developers.
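
One possible shape for point 3, sketched in plain Java under the assumption
that the library knows where each serde came from when it wires the
topology. `OriginAware` and the other names are hypothetical, not taken from
the Kafka Streams code base. The wrapper catches the type mismatch and
rethrows it with a message saying whether the default serde was in use.

```java
import java.nio.charset.StandardCharsets;

class SerdeOriginDemo {
    // Hypothetical stand-in for a serializer; not Kafka's own interface.
    interface Serializer<T> {
        byte[] serialize(T data);
    }

    static final class StringSerializer implements Serializer<String> {
        @Override
        public byte[] serialize(String data) {
            return data.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Wraps a serializer and remembers whether it came from the defaults.
    static final class OriginAware<T> implements Serializer<T> {
        private final Serializer<T> delegate;
        private final boolean fromDefaultConfig;

        OriginAware(Serializer<T> delegate, boolean fromDefaultConfig) {
            this.delegate = delegate;
            this.fromDefaultConfig = fromDefaultConfig;
        }

        @Override
        public byte[] serialize(T data) {
            try {
                return delegate.serialize(data);
            } catch (ClassCastException e) {
                String origin = fromDefaultConfig
                    ? "the DEFAULT serde from configuration"
                    : "an explicitly supplied serde";
                throw new IllegalStateException(
                    delegate.getClass().getSimpleName() + " (" + origin
                        + ") cannot handle a value of type "
                        + data.getClass().getName(), e);
            }
        }
    }

    // Feeds a value to a String default and returns the resulting error
    // message, or "no failure" if serialization succeeded.
    @SuppressWarnings("unchecked")
    static String failureMessage(Object value) {
        Serializer<Object> misconfigured =
            (Serializer<Object>) (Serializer<?>) new StringSerializer();
        try {
            new OriginAware<>(misconfigured, true).serialize(value);
            return "no failure";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(failureMessage(3L));
    }
}
```

The original ClassCastException is kept as the cause, so the stack trace is
still available while the message points at the default configuration.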

On Thu., 14 Jun. 2018, 12:28 am Guozhang Wang, <wangguoz@gmail.com> wrote:

> Hello Stéphane,
> Good question :) And there have been some discussions about the default
> serdes in the past in the community, my two cents about this:
> 1) When a user tries out Streams for the first time she is likely to use
> some primitive typed data as her first POC app, in which case the data
> types of the intermediate streams can change frequently and hence a default
> serde would not help much but may introduce confusions; on the other hand,
> in real production environment users are likely to use some data schema
> system like Avro / Protobuf, and hence their declared serde may well be
> consistent. For example if you are using Avro with GenericRecord, then all
> the value types throughout your topology may be of the same type, so just
> declaring a `Serdes<GenericRecord, GenericRecord>` would help. Over time,
> this is indeed what we have seen from practical user scenarios.
> 2) So to me the question is for top-of-the-funnel adoptions, could we make
> the OOTB experience better with serdes for users. We've discussed some
> ideas around this topic, like improving our typing systems so that users
> can specify some serdes per type (for primitive types we can pre-register a
> list of default ones as well), and the library can infer the data types and
> choose which serde to use automatically. However for Java type erasure
> makes it tricky (I think it is still the case in Java8), and we cannot
> always make it work. And that's where we paused on investigating further.
> Note that in the coming 2.0 release we have a Scala API for Streams where
> default serdes are indeed dropped since with Scala we can safely rely on
> implicit typing inference to override the serdes automatically.
> Guozhang
> On Tue, Jun 12, 2018 at 6:32 PM, Stephane Maarek <
> stephane@simplemachines.com.au> wrote:
> > Hi
> >
> > Coming from a user perspective, I see a lot of beginners not understanding
> > the need for serdes and misusing the default serde settings.
> >
> > I believe default serdes do more harm than good. At best, they save a bit
> > of boilerplate code but hide the complexity of serde happening at each
> > step. At worst, they generate confusion and make debugging tremendously
> > hard as the errors thrown at runtime don't indicate that the serde being
> > used is the default one.
> >
> > What do you think of deprecating them as well as any API that does not use
> > explicit serde?
> >
> > I know this may be a "tough change", but in my opinion it'll allow for more
> > explicit development and easier debugging.
> >
> > Regards
> > Stéphane
> >
> --
> -- Guozhang
