beam-commits mailing list archives

From "Kenneth Knowles (JIRA)" <>
Subject [jira] [Created] (BEAM-3227) Consider sharing Udf/SdkFunctionSpec records via pointer
Date Mon, 20 Nov 2017 16:23:00 GMT
Kenneth Knowles created BEAM-3227:

             Summary: Consider sharing Udf/SdkFunctionSpec records via pointer
                 Key: BEAM-3227
             Project: Beam
          Issue Type: Sub-task
          Components: beam-model
            Reporter: Kenneth Knowles

Coders are stored by pointer because they are often repeated and are a common source of huge
pipeline descriptions.

We considered doing the same for all UDFs but decided not to, reasoning that they are less
often identical and will rarely implement the equals() needed to actually share encoded
versions.

However, in the presence of generated code, it is very likely that DoFns and CombineFns are
repeated, and also much more likely that they have meaningful equals(), so there could be
size savings.
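
As a rough illustration of what sharing by pointer could look like, here is a minimal sketch.
It is not Beam's actual model or API: FunctionSpec and SpecRegistry are hypothetical names,
and the "fnN" id scheme is an assumption made for the example. The registry stores each
distinct spec once and hands back a short id; a repeated spec maps to the same id only
because the record has a meaningful equals() and hashCode().

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for a serialized UDF record; not Beam's actual proto. */
final class FunctionSpec {
  final String urn;
  final String payload;

  FunctionSpec(String urn, String payload) {
    this.urn = urn;
    this.payload = payload;
  }

  // Value equality is what makes sharing possible: two specs with the
  // same urn and payload are treated as the same record.
  @Override
  public boolean equals(Object o) {
    if (!(o instanceof FunctionSpec)) {
      return false;
    }
    FunctionSpec other = (FunctionSpec) o;
    return urn.equals(other.urn) && payload.equals(other.payload);
  }

  @Override
  public int hashCode() {
    return Objects.hash(urn, payload);
  }
}

/** Hypothetical registry that stores each distinct spec once and returns an id for it. */
final class SpecRegistry {
  private final Map<FunctionSpec, String> idsBySpec = new LinkedHashMap<>();

  /** Returns the existing id for an equal spec, or assigns a new one. */
  String register(FunctionSpec spec) {
    String id = idsBySpec.get(spec);
    if (id == null) {
      id = "fn" + idsBySpec.size(); // assumed id scheme, for illustration only
      idsBySpec.put(spec, id);
    }
    return id;
  }

  /** Number of distinct specs actually stored. */
  int distinctSpecs() {
    return idsBySpec.size();
  }
}
```

Registering two equal specs yields the same id and stores the payload once, so each
transform that uses the function carries only the short id. Without a value-based equals(),
every instance would get its own entry and nothing would be saved.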

None of this is terribly important for storage or transmission; it has more to do with the
small, arbitrary size limits imposed by some API frameworks or database column types.

This message was sent by Atlassian JIRA
