beam-commits mailing list archives

From "peay (JIRA)" <>
Subject [jira] [Commented] (BEAM-1573) KafkaIO does not allow using Kafka serializers and deserializers
Date Mon, 06 Mar 2017 18:15:33 GMT


peay commented on BEAM-1573:

[~rangadi] OK, that makes sense. I was hoping to keep everything as a single step,
for instance to be able to leverage {code}withTimestampFn{code}, but I'll go with (1) for now.

If the long-term plan is to remove the use of coders in read/write and allow Kafka
serializers to be passed in directly, that was my original point, so all the better. I am happy to work on
a PR for that if you want me to. I think {code}KafkaIO{code} can still provide a typed reader/writer
through methods like {code}withCustomKafkaValueSerializer{code}, to avoid the extraneous ParDo
and having to call a utility to get something other than `byte[]`, which I assume is often
going to be the case.

The main issue I see is that removing {code}withValueCoder{code} and so on will break API
compatibility. I am not sure what the project's policy is on that [~jkff]? A deprecation phase
of a couple of releases, and then breaking changes?
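The typed reader idea above can be sketched in plain Java with a stand-in for Kafka's deserializer interface (the interface and method names below are illustrative only, not actual Beam or Kafka API): a read step parameterized by a Kafka-style deserializer emits typed values directly, instead of emitting `byte[]` and requiring a separate ParDo to convert.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

// Stand-in for org.apache.kafka.common.serialization.Deserializer,
// for illustration only; this is not the real Kafka interface.
interface KafkaStyleDeserializer<T> {
    T deserialize(String topic, byte[] data);
}

public class TypedReaderSketch {
    // A typed read step that applies a Kafka-style deserializer directly,
    // so callers get T values instead of byte[] plus a follow-up ParDo.
    static <T> List<T> readTyped(String topic, List<byte[]> records,
                                 KafkaStyleDeserializer<T> deserializer) {
        return records.stream()
            .map(bytes -> deserializer.deserialize(topic, bytes))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<byte[]> raw = List.of("a".getBytes(StandardCharsets.UTF_8),
                                   "b".getBytes(StandardCharsets.UTF_8));
        List<String> values = readTyped("events", raw,
            (topic, data) -> new String(data, StandardCharsets.UTF_8));
        System.out.println(values); // [a, b]
    }
}
```

In this shape, the deserializer sees the topic name, which is exactly what a `Coder`-based hook cannot offer.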

> KafkaIO does not allow using Kafka serializers and deserializers
> ----------------------------------------------------------------
>                 Key: BEAM-1573
>                 URL:
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>    Affects Versions: 0.4.0, 0.5.0
>            Reporter: peay
>            Assignee: Raghu Angadi
>            Priority: Minor
> KafkaIO does not allow overriding the serializer and deserializer settings of the Kafka
consumers and producers it uses internally. Instead, it allows setting a `Coder`, and has a
simple Kafka serializer/deserializer wrapper class that calls the coder.
> I appreciate that allowing Beam coders to be used is good and consistent with the rest of
the system. However, is there a reason to completely disallow custom Kafka serializers?
> This is a limitation when working with an Avro schema registry, for instance, which requires
custom serializers. One can write a `Coder` that wraps a custom Kafka serializer, but that
means two levels of unnecessary wrapping.
> In addition, the `Coder` abstraction is not equivalent to Kafka's `Serializer`, which
gets the topic name as input. Using a `Coder` wrapper would require duplicating the output
topic setting in the argument to `KafkaIO` and when building the wrapper, which is inelegant
and error-prone.

This message was sent by Atlassian JIRA
