beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-1211) Improve Coder efficiency in outer context for StandardCoder
Date Wed, 28 Dec 2016 23:31:58 GMT

    [ https://issues.apache.org/jira/browse/BEAM-1211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15783985#comment-15783985 ]

ASF GitHub Bot commented on BEAM-1211:
--------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/beam/pull/1680


> Improve Coder efficiency in outer context for StandardCoder
> -----------------------------------------------------------
>
>                 Key: BEAM-1211
>                 URL: https://issues.apache.org/jira/browse/BEAM-1211
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-java-core
>            Reporter: Daniel Halperin
>            Assignee: Daniel Halperin
>            Priority: Minor
>              Labels: backward-incompatible
>
> Many coders are actually composite coders that use multiple other coders in their internal
implementation. For example, a {{KvCoder}} is really just a key coder and a value coder applied
in succession. {{StandardCoder}} is a parent abstract class that captures this coder design
pattern.
> If a coder has {{N > 1}} component coders, the first {{N-1}} of these coders must
be used in the nested context. For the example of KV, the keyCoder will (almost) *never*
own the whole stream -- after the keyCoder has written its encoded bytes, the valueCoder owns
the next bytes. So the keyCoder must always operate in the nested context.
> However, the final coder can actually inherit the context of the parent. Back to the
{{KvCoder}} example: if the {{KvCoder}} owns the whole stream, then its valueCoder owns the
whole stream following the bytes of the keyCoder. If the {{KvCoder}} does not own the whole
stream, then the valueCoder obviously does not either. In either case, the valueCoder should
use the context of the {{KvCoder}}.
> Operating in the outer context is often more efficient in both bytes and computation,
so passing the parent's context to the child can be an optimization. This issue is to audit
{{StandardCoder}} implementations and apply this optimization to the last coder where possible.
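
The rule described above, where only the last component coder inherits the parent's context, can be sketched with a toy model. This is not the Beam API: `ContextSketch`, `ToyCoder`, `ToyStringCoder`, `ToyKvCoder`, and the 4-byte length prefix are all hypothetical names and choices made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

public class ContextSketch {
  /** OUTER: the coder owns the rest of the stream. NESTED: more bytes may follow. */
  enum Context { OUTER, NESTED }

  /** Hypothetical stand-in for a coder interface; not Beam's Coder. */
  interface ToyCoder<T> {
    void encode(T value, OutputStream out, Context context) throws IOException;
  }

  /** Needs a length prefix only when nested; in the outer context the stream end delimits it. */
  static class ToyStringCoder implements ToyCoder<String> {
    @Override
    public void encode(String value, OutputStream out, Context context) throws IOException {
      byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
      if (context == Context.NESTED) {
        new DataOutputStream(out).writeInt(bytes.length); // assumed 4-byte prefix
      }
      out.write(bytes);
    }
  }

  /** Composite coder: key first, then value. */
  static class ToyKvCoder<K, V> implements ToyCoder<Map.Entry<K, V>> {
    private final ToyCoder<K> keyCoder;
    private final ToyCoder<V> valueCoder;

    ToyKvCoder(ToyCoder<K> keyCoder, ToyCoder<V> valueCoder) {
      this.keyCoder = keyCoder;
      this.valueCoder = valueCoder;
    }

    @Override
    public void encode(Map.Entry<K, V> kv, OutputStream out, Context context)
        throws IOException {
      // The key is never last, so it is always nested.
      keyCoder.encode(kv.getKey(), out, Context.NESTED);
      // The value is last, so it inherits whatever context the KV coder was given.
      valueCoder.encode(kv.getValue(), out, context);
    }
  }

  /** Encodes the pair ("k", "value") in the given context and returns the byte count. */
  static int encodedSize(Context context) {
    ToyKvCoder<String, String> coder =
        new ToyKvCoder<>(new ToyStringCoder(), new ToyStringCoder());
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try {
      coder.encode(new SimpleEntry<>("k", "value"), out, context);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.size();
  }

  public static void main(String[] args) {
    System.out.println("outer:  " + encodedSize(Context.OUTER));  // prints "outer:  10"
    System.out.println("nested: " + encodedSize(Context.NESTED)); // prints "nested: 14"
  }
}
```

In this toy model the saving is exactly the value's length prefix: 4 bytes for the key prefix plus 1 key byte plus 5 value bytes in the outer context, versus an extra 4-byte value prefix when nested.
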
> Refer to [BEAM-469] for a detailed discussion of this issue in the context of {{NullableCoder}}.
> (About the "almost never" above: the exception is a valueCoder that always encodes to
zero bytes, like the global window coder or the void coder; in all other cases this
general principle holds. It is not obvious how to optimize for that case at the moment,
though there are probably some coders where this is valuable.)
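
The "almost never" exception can be illustrated with another toy sketch. It assumes a value that always encodes to zero bytes and uses hypothetical helpers (`encodeString`, `encodeVoid`, `kvSize`) with a single-byte length prefix; none of these are Beam APIs. It only shows where the wasted bytes come from, not how a real coder would detect this case.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class ZeroByteValueSketch {
  enum Context { OUTER, NESTED }

  /** Toy string encoding: one length byte when nested, none in the outer context. */
  static void encodeString(String s, ByteArrayOutputStream out, Context context) {
    byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
    if (context == Context.NESTED) {
      out.write(bytes.length); // toy single-byte prefix; assumes fewer than 256 bytes
    }
    out.write(bytes, 0, bytes.length);
  }

  /** Toy void encoding: writes nothing, regardless of context. */
  static void encodeVoid(ByteArrayOutputStream out, Context context) {
    // Intentionally empty: the value always encodes to zero bytes.
  }

  /** Size of a (key, void) pair when the KV itself is encoded in the outer context. */
  static int kvSize(String key, Context keyContext) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    encodeString(key, out, keyContext);
    encodeVoid(out, Context.OUTER);
    return out.size();
  }

  public static void main(String[] args) {
    // The usual rule: the key is nested, so it pays for a length prefix.
    System.out.println(kvSize("key", Context.NESTED)); // prints 4
    // Because no bytes follow the key here, the key effectively owns the stream.
    System.out.println(kvSize("key", Context.OUTER));  // prints 3
  }
}
```

The second call is only safe because the value is known to contribute no bytes and the pair as a whole owns the stream; a composite coder would need some way to know that about its last component.
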



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
