beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (BEAM-1983) SDF should properly support windowed side inputs
Date Mon, 17 Apr 2017 22:43:41 GMT


ASF GitHub Bot commented on BEAM-1983:

GitHub user jkff opened a pull request:

    [BEAM-1983] Splittable DoFn changes for properly supporting side inputs


You can merge this pull request into a Git repository by running:

    $ git pull spd-fixes

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2556
commit 30caf69146c4699993469571927ef3809a1e6e2d
Author: Eugene Kirpichov <>
Date:   2017-04-15T23:38:35Z

    Separates side input test and side output test

commit 670b2325c3e2d97a2c34cd318417f982072646fb
Author: Eugene Kirpichov <>
Date:   2017-04-15T23:39:51Z

    ProcessFn remembers more info about its application context

commit 89759ea0778c4bc3b0e9b0edebb7283569321baf
Author: Eugene Kirpichov <>
Date:   2017-04-17T19:25:02Z

    Minor cleanups in ParDoEvaluator

commit cc34d5f3f1dee423c5d6df1845acdf89da80f2e3
Author: Eugene Kirpichov <>
Date:   2017-04-17T21:41:53Z

    Extracts interface from PushbackSideInputDoFnRunner

commit 37cba1531a5249ecb085776259c345638c6d8623
Author: Eugene Kirpichov <>
Date:   2017-04-17T21:52:23Z

    Creates ProcessFnRunner and wires it through ParDoEvaluator

commit b26c6ac9d7004e1e5e986ffa6326aa1565335662
Author: Eugene Kirpichov <>
Date:   2017-04-17T18:28:24Z

    Explodes windows before GBKIKWI
    * Adds a test for windowed side inputs that requires this
    * Adds a test category for SDF with windowed side input.
      Runners should gradually implement this properly. For now
      only direct runner implements this properly.


> SDF should properly support windowed side inputs
> ------------------------------------------------
>                 Key: BEAM-1983
>                 URL:
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-apex, runner-dataflow, runner-direct, runner-flink, sdk-java-core
>            Reporter: Eugene Kirpichov
>            Assignee: Eugene Kirpichov
> Currently there is no test coverage for Splittable DoFn + windowed side inputs, especially
when not all of the side input windows are ready.
> Moreover, current implementation of SDF in the direct runner is definitely wrong: it
uses a ParDoEvaluator to run the ProcessFn, and this ParDoEvaluator looks at the wrong windows
to decide which windows are ready and which are not:
- the WindowedValue in question is a KeyedWorkItem, and they are always in the global window,
but the important windows are windows of elements inside this KWI's elementsIterable().
> The Flink implementation is also wrong in the same way.
> This JIRA is to:
> 1) add test coverage for this case
> 2) implement proper support in all runners
> I believe the easiest way to do 2) is to:
> - make SplittableParDo, in case the DoFn has side inputs, pre-explode windows before
feeding them into GroupByKeyIntoKeyedWorkItems , so that the resulting KWI's have elements
only in a single window
> - tweak runners to look at the proper window, and assert that there's only one window,
while evaluating ProcessFn, in case the DoFn uses side inputs

This message was sent by Atlassian JIRA

View raw message