beam-commits mailing list archives

From "Luke Cwik (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (BEAM-1314) DoFn per-key lifecycle (Setup/Teardown)
Date Wed, 14 Jun 2017 01:36:00 GMT

    [ https://issues.apache.org/jira/browse/BEAM-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16048581#comment-16048581 ]

Luke Cwik edited comment on BEAM-1314 at 6/14/17 1:35 AM:
----------------------------------------------------------

I believe this is more of an implementation detail of how the Java SDK works. Nothing in the
model says that a DoFn which uses user state can't be reused across different keys within
the same bundle. I believe the flaws lie in how KeyedWorkItem is used and in how the current
key propagates through all the context objects within the Java SDK runners core.


was (Author: lcwik):
I believe this is more of an implementation detail of how the Java SDK works. Nothing in the
model says that a DoFn which uses user state can't be re-used across different keys within
the same bundle. I believe the flaws are within the usage of a KeyedWorkItem and how the current
key propagates through all the context objects within the Java SDK runners core.

> DoFn per-key lifecycle (Setup/Teardown)
> ---------------------------------------
>
>                 Key: BEAM-1314
>                 URL: https://issues.apache.org/jira/browse/BEAM-1314
>             Project: Beam
>          Issue Type: Wish
>          Components: beam-model
>            Reporter: Eugene Kirpichov
>
> DoFns that use state and timers are implicitly per-key. Setup/Teardown methods are usually
> used to establish expensive resources, such as long-standing connections.
> For per-key DoFns, we'd often want to hold these resources per key, so it would be good to
> have the ability in the model to ask that there be one instance of the DoFn per key, reused
> between e.g. different timer or trigger firings for this key, but not used for other keys.
> For example, this would be particularly useful for Splittable DoFn, which could then reuse
> expensive resources between checkpoints.
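
The per-key lifecycle requested above can be pictured with a toy model in plain Java. This is
only a sketch under stated assumptions: ExpensiveFn, PerKeyInstances, deliver, and shutdown are
hypothetical names made up for illustration. None of them are Beam APIs, and this is not how any
runner is implemented. It only shows the requested contract: setup runs once per key, the same
instance is reused for later elements of that key, and it is never shared with other keys.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: none of these types are Beam APIs. */
public class PerKeyLifecycleSketch {

  /** Hypothetical stand-in for a DoFn that holds an expensive resource. */
  static class ExpensiveFn {
    final String key;
    boolean open;
    int elementsSeen;

    ExpensiveFn(String key) {
      this.key = key;
    }

    // Stands in for acquiring a long-lived resource such as a connection.
    void setup() {
      open = true;
    }

    void process(String element) {
      if (!open) {
        throw new IllegalStateException("setup() was not called");
      }
      elementsSeen++;
    }

    // Stands in for releasing the resource.
    void teardown() {
      open = false;
    }
  }

  /** Hypothetical runner-side cache that keeps one instance per key. */
  static class PerKeyInstances {
    // LinkedHashMap keeps insertion order so the log is deterministic.
    private final Map<String, ExpensiveFn> byKey = new LinkedHashMap<>();
    final List<String> log = new ArrayList<>();

    void deliver(String key, String element) {
      ExpensiveFn fn = byKey.get(key);
      if (fn == null) {
        // First element for this key: create the instance and set it up once.
        fn = new ExpensiveFn(key);
        fn.setup();
        log.add("setup:" + key);
        byKey.put(key, fn);
      }
      fn.process(element);
    }

    // Number of elements processed so far for the key, or 0 if the key is unknown.
    int seen(String key) {
      ExpensiveFn fn = byKey.get(key);
      return fn == null ? 0 : fn.elementsSeen;
    }

    // Tears down every cached instance, in the order the keys first appeared.
    void shutdown() {
      for (ExpensiveFn fn : byKey.values()) {
        fn.teardown();
        log.add("teardown:" + fn.key);
      }
      byKey.clear();
    }
  }

  public static void main(String[] args) {
    PerKeyInstances instances = new PerKeyInstances();
    instances.deliver("a", "first");
    instances.deliver("b", "second");
    instances.deliver("a", "third"); // reuses the instance created for "a"
    instances.shutdown();
    System.out.println(instances.log);
  }
}
```

In this sketch the cache is the piece the issue asks the model to provide. In the behaviour the
comment above describes, a single instance may instead be reused across different keys within
the same bundle.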



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
