beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BEAM-1925) Make DoFn invocation logic of Python SDK more extensible
Date Wed, 26 Apr 2017 20:21:04 GMT

    [ https://issues.apache.org/jira/browse/BEAM-1925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15985490#comment-15985490 ]

ASF GitHub Bot commented on BEAM-1925:
--------------------------------------

GitHub user sb2nov opened a pull request:

    https://github.com/apache/beam/pull/2712

    [BEAM-1925] Remove deprecated context param from DoFn

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:
    
     - [ ] Make sure the PR title is formatted like:
       `[BEAM-<Jira issue #>] Description of pull request`
     - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
           Travis-CI on your fork and ensure the whole test matrix passes).
     - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
           number, if there is one.
     - [ ] If this contribution is large, please file an Apache
           [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
    
    ---
    
    R: @chamikaramj @robertwb PTAL

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sb2nov/beam BEAM-1925-remove-deprecated-context-param

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2712.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2712
    
----
commit d5a6e0917642a3d18d93ac9ebabb209364560e77
Author: Sourabh Bajaj <sourabhbajaj@google.com>
Date:   2017-04-26T20:20:01Z

    [BEAM-1925] Remove deprecated context param from DoFn

----


> Make DoFn invocation logic of Python SDK more extensible
> --------------------------------------------------------
>
>                 Key: BEAM-1925
>                 URL: https://issues.apache.org/jira/browse/BEAM-1925
>             Project: Beam
>          Issue Type: Improvement
>          Components: sdk-py
>            Reporter: Chamikara Jayalath
>            Assignee: Chamikara Jayalath
>
> DoFn invocation logic of the Python SDK currently lives in the DoFnRunner class.
> https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/common.py#L54
> At initialization, DoFnRunner parses a DoFn and creates local state. We use this state
> when invoking the DoFn methods process, start_bundle, and finish_bundle. For example, we store
> a list of ArgPlaceholder objects within the state of DoFnRunner to facilitate invocation
> of the process method.
> We will need to extend this functionality when adding new features to the DoFn class (for
> example, to support Splittable DoFn [1]). So I think it's good to refactor this code to be
> more extensible.
> I think a good approach is to add DoFnInvoker and DoFnSignature classes similar
> to those in the Java SDK [2].
> In this approach:
> A DoFnSignature captures the signature of a DoFn, including its methods and arguments.
> A DoFnInvoker implements a particular way DoFn methods will be executed (initially we'll
> have simple and per-window invokers [3]).
> A runner uses DoFnRunner to execute methods of a given DoFn. At initialization, DoFnRunner
> creates a DoFnSignature and a DoFnInvoker for the given DoFn.
> DoFnSignature and DoFnInvoker methods will be used by the SplittableDoFn implementation as
> well.
> [1] https://docs.google.com/document/d/1h_zprJrOilivK2xfvl4L42vaX4DMYGfH1YDmi-s_ozM/edit#heading=h.e6patunrpiql
> [2] https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/reflect/DoFnSignature.java
> [3] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/common.py#L200
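
The division of responsibilities described in the issue could look roughly like the sketch below. This is only an illustration under stated assumptions, not the code in the pull request or in apache_beam: the stand-in DoFn base class, SimpleInvoker, the invoke_* methods, run_bundle, and SplitWords are hypothetical names introduced here. Only DoFnSignature, DoFnInvoker (represented by SimpleInvoker), and DoFnRunner come from the issue text.

```python
import inspect


class DoFn(object):
    """Stand-in DoFn base class (hypothetical, not apache_beam's DoFn)."""

    def start_bundle(self):
        pass

    def process(self, element):
        raise NotImplementedError

    def finish_bundle(self):
        pass


class DoFnSignature(object):
    """Captures the methods of a DoFn and the argument names of each."""

    METHODS = ('start_bundle', 'process', 'finish_bundle')

    def __init__(self, do_fn):
        self.do_fn = do_fn
        self.args = {}
        for name in self.METHODS:
            method = getattr(do_fn, name)
            # For a bound method, inspect.signature omits 'self'.
            self.args[name] = list(inspect.signature(method).parameters)


class SimpleInvoker(object):
    """One way of executing DoFn methods: call them directly (assumed)."""

    def __init__(self, signature):
        self.signature = signature
        self.do_fn = signature.do_fn

    def invoke_start_bundle(self):
        self.do_fn.start_bundle()

    def invoke_process(self, element):
        result = self.do_fn.process(element)
        # Treat a None result as "no outputs".
        return [] if result is None else list(result)

    def invoke_finish_bundle(self):
        self.do_fn.finish_bundle()


class DoFnRunner(object):
    """Creates a signature and an invoker at initialization, then delegates."""

    def __init__(self, do_fn):
        self.signature = DoFnSignature(do_fn)
        self.invoker = SimpleInvoker(self.signature)

    def run_bundle(self, elements):
        outputs = []
        self.invoker.invoke_start_bundle()
        for element in elements:
            outputs.extend(self.invoker.invoke_process(element))
        self.invoker.invoke_finish_bundle()
        return outputs


class SplitWords(DoFn):
    """Example DoFn that splits each element on whitespace."""

    def process(self, element):
        return element.split()


runner = DoFnRunner(SplitWords())
words = runner.run_bundle(['a b', 'c'])
```

Keeping the argument inspection in DoFnSignature and the calling convention in the invoker means a different invoker (for example a per-window one) could be swapped in without changing how DoFnRunner is constructed.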



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
