beam-commits mailing list archives

From "Kenneth Knowles (JIRA)" <>
Subject [jira] [Commented] (BEAM-2021) Fix Java's Coder class hierarchy
Date Thu, 20 Apr 2017 20:41:04 GMT


Kenneth Knowles commented on BEAM-2021:

Yeah, that's pretty much it. We'll have an added layer of "CoderEncoders" registered as services.

For core things, it is somewhat easy:
 - In SDK core: KvCoder with just basic accessors, getKeyCoder() etc.
 - In core construction:
   - KvCoderEncoder, which knows the URN and knows to put getKeyCoder() and getValueCoder() into the component coders
   - If some payload needs to be added, it can be built directly as a proto, so that it is cross-SDK (and cross-runner if they need to know about it). What the payload might be depends on the URN, so you can have a HeapCoder with explicit component coders but also a java-serialized comparator.
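
To make the split concrete, here is a minimal sketch of the two halves. Everything in it is a hypothetical stand-in: the Coder interface, the EncodedCoder holder (a plain-Java placeholder for the runner API proto), and the URN string are assumptions for illustration, not Beam's actual classes or identifiers.

```java
import java.util.Arrays;
import java.util.List;

public class KvCoderEncoderSketch {
  /** Hypothetical stand-in for the SDK's Coder type. */
  interface Coder {}

  static final class StringCoder implements Coder {}

  static final class LongCoder implements Coder {}

  /** SDK-core side: only basic accessors, no knowledge of URNs or protos. */
  static final class KvCoder implements Coder {
    private final Coder keyCoder;
    private final Coder valueCoder;

    KvCoder(Coder keyCoder, Coder valueCoder) {
      this.keyCoder = keyCoder;
      this.valueCoder = valueCoder;
    }

    Coder getKeyCoder() {
      return keyCoder;
    }

    Coder getValueCoder() {
      return valueCoder;
    }
  }

  /** Plain-Java placeholder for the runner API message: URN, components, payload. */
  static final class EncodedCoder {
    final String urn;
    final List<Coder> componentCoders;
    final byte[] payload;

    EncodedCoder(String urn, List<Coder> componentCoders, byte[] payload) {
      this.urn = urn;
      this.componentCoders = componentCoders;
      this.payload = payload;
    }
  }

  /** Core-construction side: knows the URN and which accessors feed the components. */
  static final class KvCoderEncoder {
    // Placeholder URN, assumed for this sketch only.
    static final String URN = "urn:example:coder:kv";

    EncodedCoder encode(KvCoder coder) {
      // KvCoder needs no extra payload, so the payload is left empty.
      return new EncodedCoder(
          URN, Arrays.asList(coder.getKeyCoder(), coder.getValueCoder()), new byte[0]);
    }
  }

  public static void main(String[] args) {
    KvCoder kv = new KvCoder(new StringCoder(), new LongCoder());
    EncodedCoder encoded = new KvCoderEncoder().encode(kv);
    System.out.println(encoded.urn + " with " + encoded.componentCoders.size() + " components");
  }
}
```

A coder such as the HeapCoder mentioned above would follow the same shape, except that its encoder would fill the payload (for example with the serialized comparator) instead of leaving it empty.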

For non-core stuff like IOs or library transforms, it should be similar. In the extension library, include:
 - The FooCoder with its natural interface
 - A registered service for its FooCoderEncoder

So what I mean by the tricky bit is the design decision between:
 - CoderEncoder interface lives in the SDK and does not have proto on its API surface (or we figure out a way for this to be safe)
 - CoderEncoder interface lives in core construction (or another module...) and IOs that want to have cross-language/grokkable/compact encodings take a dependency
 - Other option?

> Fix Java's Coder class hierarchy
> --------------------------------
>                 Key: BEAM-2021
>                 URL:
>             Project: Beam
>          Issue Type: Improvement
>          Components: beam-model-runner-api, sdk-java-core
>    Affects Versions: First stable release
>            Reporter: Kenneth Knowles
>            Assignee: Thomas Groh
> This is thoroughly out of hand. In the runner API world, there are two paths:
> 1. URN plus component coders plus custom payload (in the form of component coders alongside an SdkFunctionSpec)
> 2. Custom coder (a single URN) and payload is serialized Java. I think this never has component coders.
> The other base classes have now been shown to be extraneous: they favor saving ~3 lines of boilerplate for rarely written code at the cost of readability. Instead they should just be dropped.
> The custom payload is an Any proto in the runner API. But tying the Coder interface to proto would be unfortunate from a design perspective and cannot be done anyhow due to dependency

