beam-commits mailing list archives

From María GH (JIRA) <j...@apache.org>
Subject [jira] [Updated] (BEAM-802) Support Dynamic PipelineOptions for python
Date Mon, 24 Oct 2016 15:16:58 GMT

     [ https://issues.apache.org/jira/browse/BEAM-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

María GH updated BEAM-802:
--------------------------
    Description: 
Goal: Enable users to run pipelines from templates whose values are filled in from the command
line (pipeline options).
Background: Currently, the Runner creates the JSON pipeline description, which can be sent
to the worker as is, since everything is already defined there (with links to gs:// for input
and binaries). With the parametrized approach, parts of the description are left empty, to be
filled in by the user or defaulted, so the pipeline needs to be stored somewhere first until
the values become available.
Tasks:
1- Create a template-style pipeline description (TemplateRunner)
The graph description is now a template (some parts are not filled in) that needs to be saved.
2- Define the values to inject into the template (ValueProviders API)
The placeholders can be filled with default values (static) or with dynamic key/value pairs
provided at runtime (dynamic).

  was:
Goal: Enable users to run pipelines from templates whose values are filled in from the command
line (pipeline options).
Background: Currently, the Runner creates the JSON pipeline description, which can be sent
to the worker as is, since everything is already defined there (with links to gs:// for input
and binaries). With the parametrized approach, parts of the description are left empty, to be
filled in by the user or defaulted, so the pipeline needs to be stored somewhere first until
the values become available.
Tasks:
1- Create a template-style pipeline description (TemplateRunner)
The graph description is now a template (some parts are not filled in) that needs to be saved.
2- Define the values to inject into the template (ValueProviders API)
The placeholders can be filled with default values (static) or with dynamic key/value pairs
provided at runtime (dynamic).
3- Adapt the service
        1- Allow the various IO classes to accept placeholders and defer validation for dynamic
values:

For Text: file.open(<some value>)
For BigQuery: table.select(<some value>)
        2- Have a mechanism for the service to instantiate the template with the actual values,
so that the worker can start working on the pipeline.

    Component/s:     (was: beam-model)
                 sdk-py

> Support Dynamic PipelineOptions for python
> ------------------------------------------
>
>                 Key: BEAM-802
>                 URL: https://issues.apache.org/jira/browse/BEAM-802
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-py
>            Reporter: María GH
>            Priority: Minor
>   Original Estimate: 1,680h
>  Remaining Estimate: 1,680h
>
> Goal: Enable users to run pipelines from templates whose values are filled in from the
> command line (pipeline options).
> Background: Currently, the Runner creates the JSON pipeline description, which can be
> sent to the worker as is, since everything is already defined there (with links to gs:// for
> input and binaries). With the parametrized approach, parts of the description are left empty,
> to be filled in by the user or defaulted, so the pipeline needs to be stored somewhere first
> until the values become available.
> Tasks:
> 1- Create a template-style pipeline description (TemplateRunner)
> The graph description is now a template (some parts are not filled in) that needs to be
> saved.
> 2- Define the values to inject into the template (ValueProviders API)
> The placeholders can be filled with default values (static) or with dynamic key/value
> pairs provided at runtime (dynamic).
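
The two tasks above could be sketched roughly as follows. This is only an illustration
of the idea, not the design chosen for the SDK: the class names StaticValueProvider and
RuntimeValueProvider, the helpers save_template and instantiate_template, the JSON layout,
and the gs:// paths are all assumptions made up for the example.

```python
import json


class StaticValueProvider:
    """Hypothetical: wraps a value already known when the template is built."""

    def __init__(self, value):
        self._value = value

    def is_accessible(self):
        return True

    def get(self):
        return self._value


class RuntimeValueProvider:
    """Hypothetical: a named placeholder whose value arrives only at run time."""

    def __init__(self, name, default=None):
        self.name = name
        self.default = default

    def is_accessible(self):
        # At template-construction time the value is not known yet.
        return False

    def get(self):
        raise RuntimeError("%s is not available until run time" % self.name)


def save_template(steps):
    """Serialize the graph, leaving run-time values as named placeholders."""
    out = []
    for step_name, provider in steps:
        if provider.is_accessible():
            out.append({"step": step_name, "value": provider.get()})
        else:
            out.append({"step": step_name,
                        "placeholder": provider.name,
                        "default": provider.default})
    return json.dumps(out)


def instantiate_template(template_json, runtime_options):
    """Fill each placeholder from runtime_options, else use its default."""
    resolved = []
    for step in json.loads(template_json):
        if "placeholder" in step:
            value = runtime_options.get(step["placeholder"], step["default"])
        else:
            value = step["value"]
        resolved.append({"step": step["step"], "value": value})
    return resolved


# Build and store the template once (made-up paths) ...
template = save_template([
    ("read", RuntimeValueProvider("input", default="gs://bucket/default.txt")),
    ("write", StaticValueProvider("gs://bucket/out")),
])
# ... then instantiate it later, when the dynamic values become available.
job = instantiate_template(template, {"input": "gs://bucket/today.txt"})
```

Keeping the placeholder name and its default inside the stored template means the same
saved description can be instantiated many times with different key/value pairs.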



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
