beam-commits mailing list archives

From "Eugene Kirpichov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (BEAM-1824) Adapter for running SDF on a statically known input as a Source
Date Mon, 01 May 2017 20:03:04 GMT

     [ https://issues.apache.org/jira/browse/BEAM-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Kirpichov updated BEAM-1824:
-----------------------------------
    Description: 
[~bchambers] suggested the following idea: while the runner implementation of SDF [BEAM-65]
is not yet complete enough to support dynamic rebalancing (especially over the Fn API), we
can special-case Create.of(single input) + ParDo(SDF) by running it as a BoundedSource.
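
The adapter idea can be illustrated with a minimal, self-contained sketch. It does not use
the Beam SDK: SplittableFn, OffsetRange and StaticInputSource are hypothetical stand-ins
invented for this example, and the actual design is the one in the proposal document. The
sketch only shows how a single statically known input plus a restriction-based function can
be wrapped in a source-like object that is split up front and then read; dynamic rebalancing
itself is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

public class SdfAsSourceSketch {

    /** Hypothetical restriction: a half-open offset range [from, to). */
    static final class OffsetRange {
        final long from;
        final long to;

        OffsetRange(long from, long to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Hypothetical stand-in for a splittable DoFn (not the Beam API). */
    interface SplittableFn<InputT, OutputT> {
        OffsetRange initialRestriction(InputT input);

        void process(InputT input, OffsetRange range, List<OutputT> out);
    }

    /** Hypothetical source-like adapter over one statically known input. */
    static final class StaticInputSource<InputT, OutputT> {
        private final SplittableFn<InputT, OutputT> fn;
        private final InputT input;
        private final OffsetRange range;

        StaticInputSource(SplittableFn<InputT, OutputT> fn, InputT input) {
            this(fn, input, fn.initialRestriction(input));
        }

        private StaticInputSource(
                SplittableFn<InputT, OutputT> fn, InputT input, OffsetRange range) {
            this.fn = fn;
            this.input = input;
            this.range = range;
        }

        /** Splits this source into sub-sources covering at most desiredSize offsets each. */
        List<StaticInputSource<InputT, OutputT>> split(long desiredSize) {
            if (desiredSize <= 0) {
                throw new IllegalArgumentException("desiredSize must be positive");
            }
            List<StaticInputSource<InputT, OutputT>> parts = new ArrayList<>();
            for (long start = range.from; start < range.to; start += desiredSize) {
                long end = Math.min(start + desiredSize, range.to);
                parts.add(new StaticInputSource<>(fn, input, new OffsetRange(start, end)));
            }
            return parts;
        }

        /** Reads this source by running the function over its restriction. */
        List<OutputT> read() {
            List<OutputT> out = new ArrayList<>();
            fn.process(input, range, out);
            return out;
        }
    }

    /** Example function: emits each character of the input string within the range. */
    static SplittableFn<String, String> charsFn() {
        return new SplittableFn<String, String>() {
            @Override
            public OffsetRange initialRestriction(String input) {
                return new OffsetRange(0, input.length());
            }

            @Override
            public void process(String input, OffsetRange range, List<String> out) {
                for (long i = range.from; i < range.to; i++) {
                    out.add(String.valueOf(input.charAt((int) i)));
                }
            }
        };
    }

    public static void main(String[] args) {
        StaticInputSource<String, String> source =
                new StaticInputSource<>(charsFn(), "abcdefg");
        // Each sub-source can be read independently, e.g. by a different worker.
        for (StaticInputSource<String, String> part : source.split(3)) {
            System.out.println(part.read());
        }
    }
}
```

Reading the sub-sources in order yields the same elements as reading the unsplit source,
which is the property a runner would rely on when it treats the wrapped function as an
ordinary source.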

This will allow us to start transitioning bounded IO connectors to SDF API while preserving
the dynamic rebalancing feature in the common case when the source is known at pipeline submission
time.

Then, when SDF runner support catches up, we'll simply add APIs to the IO connectors for
reading from a PCollection of inputs, and those will enjoy the same benefits. We could
add such APIs even earlier, with the caveat that they won't support dynamic rebalancing;
that is acceptable, since these APIs didn't exist before and so there is no performance
regression.

Proposal document: http://s.apache.org/sdf-via-source

  was:
[~bchambers] suggested the following idea: while the runner implementation of SDF [BEAM-65]
is not yet complete enough to support dynamic rebalancing (especially over the Fn API), we
can special-case Create.of(single input) + ParDo(SDF) by running it as a BoundedSource.

This will allow us to start transitioning bounded IO connectors to SDF API while preserving
the dynamic rebalancing feature in the common case when the source is known at pipeline submission
time.

Then, when SDF runner support catches up, we'll simply add APIs to the IO connectors for
reading from a PCollection of inputs, and those will enjoy the same benefits. We could
add such APIs even earlier, with the caveat that they won't support dynamic rebalancing;
that is acceptable, since these APIs didn't exist before and so there is no performance
regression.

        Summary: Adapter for running SDF on a statically known input as a Source
                 (was: Adapter for running SDF on a statically known input as a BoundedSource)

> Adapter for running SDF on a statically known input as a Source
> ---------------------------------------------------------------
>
>                 Key: BEAM-1824
>                 URL: https://issues.apache.org/jira/browse/BEAM-1824
>             Project: Beam
>          Issue Type: New Feature
>          Components: runner-dataflow, sdk-java-core
>            Reporter: Eugene Kirpichov
>            Assignee: Eugene Kirpichov



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
