beam-commits mailing list archives

From "Luke Cwik (JIRA)" <>
Subject [jira] [Commented] (BEAM-3227) Consider sharing Udf/SdkFunctionSpec records via pointer
Date Mon, 20 Nov 2017 17:06:00 GMT


Luke Cwik commented on BEAM-3227:

That makes a lot of sense.

> Consider sharing Udf/SdkFunctionSpec records via pointer
> --------------------------------------------------------
>                 Key: BEAM-3227
>                 URL:
>             Project: Beam
>          Issue Type: Sub-task
>          Components: beam-model
>            Reporter: Kenneth Knowles
> Coders are stored by pointer, because they are often repeated and a common source of
> huge pipeline descriptions.
> We considered doing the same for all UDFs but decided not to, based on the logic that
> they are not as often identical and will rarely implement the equals() needed to actually
> share encoded versions.
> However, in the presence of generated code, it is very likely that DoFns and CombineFns
> are repeated, and also much more likely that they have meaningful equals(), so there could
> be size savings.
> None of this is terribly important for storage or transmission, but has more to do with
> arbitrary and small size limits that occur in some API frameworks or database column types.
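
For illustration only, here is a minimal sketch of what "sharing via pointer" could look like. It assumes function specs are kept in a table keyed by a generated id, and that equal specs (per equals()/hashCode()) collapse to a single entry. The class SpecRegistry, its methods, and the id scheme are hypothetical names made up for this sketch; they are not part of the Beam model or any Beam API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical registry that stores each distinct spec once and hands out
 * a short id (the "pointer") that can be embedded wherever the spec is used.
 * Sharing only happens when the spec type implements equals() and hashCode().
 */
final class SpecRegistry<S> {
    private final Map<S, String> idsBySpec = new LinkedHashMap<>();
    private final Map<String, S> specsById = new LinkedHashMap<>();

    /** Returns the id of an equal, already registered spec, or assigns a new one. */
    String register(S spec) {
        String existing = idsBySpec.get(spec);
        if (existing != null) {
            return existing; // repeated spec: reuse the stored record
        }
        String id = "spec" + idsBySpec.size(); // made-up id scheme
        idsBySpec.put(spec, id);
        specsById.put(id, spec);
        return id;
    }

    /** Resolves an id back to the single stored spec, or null if unknown. */
    S lookup(String id) {
        return specsById.get(id);
    }

    /** Number of distinct specs actually stored. */
    int size() {
        return specsById.size();
    }
}
```

With something like this, registering two equal generated DoFn specs would yield the same id and store one record, so the description grows by one short id per reuse rather than one full encoded spec. A spec type that falls back to identity equals() would get a fresh id every time and save nothing, which is the concern raised in the description.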

This message was sent by Atlassian JIRA
