beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (BEAM-802) Support Dynamic PipelineOptions for python
Date Wed, 05 Apr 2017 21:50:41 GMT


ASF GitHub Bot commented on BEAM-802:

GitHub user mariapython opened a pull request:

    [BEAM-802] Add ValueProvider class for FileBasedSource I/O Transforms

    - [x] Add ValueProvider class.
    - [x] Derive StaticValueProvider and RuntimeValueProvider from ValueProvider.
    - [x] Derive ValueProviderArgumentParser from argparse.ArgumentParser as API for the template
    - [x] Modify FileBasedSource I/O transforms to accept objects of type ValueProvider.
    - [x] Modify display_data.
    - [x] Handle serialization / deserialization.
    Note: #2441 addresses the failure from the previous version of this PR, and the rest of
    the issue is tracked in [BEAM-1889].
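
For readers unfamiliar with the idea, the class hierarchy named in the checklist can be illustrated roughly as follows. This is a minimal standalone sketch, not the code from this pull request: the method names (`is_accessible`, `get`), the constructor arguments, and the class-level `runtime_options` dict are assumptions made for illustration.

```python
class ValueProvider(object):
    """A value that may not be known until the pipeline actually runs."""

    def is_accessible(self):
        raise NotImplementedError

    def get(self):
        raise NotImplementedError


class StaticValueProvider(ValueProvider):
    """Wraps a value that is already known at pipeline construction time."""

    def __init__(self, value_type, value):
        self.value_type = value_type
        self.value = value_type(value)

    def is_accessible(self):
        return True

    def get(self):
        return self.value


class RuntimeValueProvider(ValueProvider):
    """Defers lookup of a named option until runtime values are supplied."""

    # Assumption for this sketch: the runtime environment sets this dict
    # once the user-provided option values become available.
    runtime_options = None

    def __init__(self, option_name, value_type, default_value=None):
        self.option_name = option_name
        self.value_type = value_type
        self.default_value = default_value

    def is_accessible(self):
        return RuntimeValueProvider.runtime_options is not None

    def get(self):
        if not self.is_accessible():
            raise RuntimeError(
                'Option %r is not accessible yet' % self.option_name)
        raw = RuntimeValueProvider.runtime_options.get(self.option_name)
        if raw is None:
            return self.default_value
        return self.value_type(raw)
```

In a design like this, an I/O transform would hold the provider object and call `get()` only when it executes, so the same serialized pipeline can be reused with different inputs.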

You can merge this pull request into a Git repository by running:

    $ git pull ppp_fix_equal

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2443
commit 57838f5be4d1854eec8a51e3aa8e583a82be38d0
Author: Maria Garcia Herrero <>
Date:   2017-04-05T21:42:13Z

    Revert "Revert "Add ValueProvider class for FileBasedSource I/O Transforms""
    This reverts commit 3eef246f761f92c626541e9008f8624b43bdcc09.


> Support Dynamic PipelineOptions for python
> ------------------------------------------
>                 Key: BEAM-802
>                 URL:
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py
>            Reporter: María GH
>            Assignee: María GH
>            Priority: Minor
>   Original Estimate: 1,680h
>  Remaining Estimate: 1,680h
> Goal: Enable users to run pipelines from templates whose parameters are filled in via the command line (pipeline options).
> Background: Currently, the Runner creates the JSON pipeline description, which can be sent to the worker as is, since everything is already defined there (with links to gs:// for input and binaries). With the parametrized approach, those descriptions are left empty, to be filled by the user or defaulted, so the pipeline needs to be stored somewhere first until the values become available.
> Tasks:
> 1- Create template-style pipeline description (TemplateRunner)
> The graph description is now a template (some parts are not filled) that needs to be
> 2- Define values to inject into the template (ValueProviders API)
> The placeholders can be filled with default values (static) or with key/value pairs provided at runtime (dynamic).
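
The two tasks above can be pictured with a small standalone sketch. It does not reflect the actual JSON format produced by any runner: the `TEMPLATE` layout, the `placeholder`/`default` keys, the example gs:// paths, and the `fill` helper are all hypothetical, chosen only to show how a stored description with unfilled slots might be completed from defaults or runtime key/value pairs.

```python
# Hypothetical template: a stored pipeline description in which some
# parameters are slots rather than concrete values.
TEMPLATE = {
    "steps": [
        {"name": "Read",
         "input": {"placeholder": "input",
                   "default": "gs://example-bucket/in.txt"}},
        {"name": "Write",
         "output": {"placeholder": "output", "default": None}},
    ]
}


def fill(node, runtime_values):
    """Return a copy of `node` with every placeholder slot resolved.

    A runtime value wins over the default; a slot with neither is an error.
    """
    if isinstance(node, dict):
        if "placeholder" in node:
            name = node["placeholder"]
            if name in runtime_values:
                return runtime_values[name]
            if node.get("default") is not None:
                return node["default"]
            raise KeyError("no value supplied for %r" % name)
        return {key: fill(value, runtime_values)
                for key, value in node.items()}
    if isinstance(node, list):
        return [fill(item, runtime_values) for item in node]
    return node


# Usage: "input" falls back to its default, "output" comes from runtime.
job = fill(TEMPLATE, {"output": "gs://example-bucket/out"})
```

The template itself is left unmodified by `fill`, which matches the requirement that the description be stored first and completed only when values become available.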

This message was sent by Atlassian JIRA
