beam-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (BEAM-270) Support Timestamps/Windows in Flink Batch
Date Tue, 17 May 2016 07:31:12 GMT


ASF GitHub Bot commented on BEAM-270:

GitHub user aljoscha opened a pull request:

    [BEAM-270] Support Timestamps/Windows in Flink Batch

    This is a cleanup version of #328, this time for real.
    The interesting things are in `FlinkPartialReduceFunction`/`FlinkReduceFunction`, `FlinkMergingPartialReduceFunction`/`FlinkMergingReduceFunction`
and `FlinkMergingNonShuffleReduceFunction`. All of these implement special cases of windowing.
The first two are for general, non-merging windows, the second set is for doing a `GroupByKey`,
the last one is for merging windows. In the last case we cannot do a pre-shuffle combine step.
    R: @kennknowles and @mxm for review

You can merge this pull request into a Git repository by running:

    $ git pull flink-windowed-value-batch-cleanup

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #343
commit 93c3f99a6be44b7aad7859927c69974d368f9903
Author: Kenneth Knowles <>
Date:   2016-05-02T20:11:12Z

    Add TestFlinkPipelineRunner to FlinkRunnerRegistrar
    This makes the runner available for selection by integration tests.

commit c48e1eaea4805359fdfc326d70b5d3c9964fe37f
Author: Kenneth Knowles <>
Date:   2016-05-02T21:04:20Z

    Configure RunnableOnService tests for Flink in batch mode
    Today Flink batch supports only global windows. This is a situation we
    intend our build to allow, eventually via JUnit category filtering.
    For now all the test classes that use non-global windows are excluded
    entirely via maven configuration. In the future, it should be on a
    per-test-method basis.

commit 4cc1acc8630a2e436acd75f5aeb4ee6b01a38dc5
Author: Aljoscha Krettek <>
Date:   2016-05-06T06:26:50Z

    Fix Dangling Flink DataSets

commit 508eebafee0a762a59d5a21a07c26f43981c304f
Author: Aljoscha Krettek <>
Date:   2016-05-06T07:38:55Z

    Add hamcrest dependency to Flink Runner
    Without it the RunnableOnService tests seem to not work

commit 3b1f064ca1f2985b6898d527f6174cc9055a1e4a
Author: Kenneth Knowles <>
Date:   2016-05-06T17:54:41Z

    Remove unused threadCount from integration tests

commit 99df86fc057d49fcf4e305d3523864d68cf5abd1
Author: Kenneth Knowles <>
Date:   2016-05-06T17:55:16Z

    Disable Flink streaming integration tests for now

commit 4b2eb1151e4cd7ef140a9e6e0eab251452ef7070
Author: Kenneth Knowles <>
Date:   2016-05-06T19:49:55Z

    Special casing job exec AssertionError in TestFlinkPipelineRunner

commit c45651fa434d91064a16f54d53b65f40eadad108
Author: Aljoscha Krettek <>
Date:   2016-05-10T11:53:03Z

    [BEAM-270] Support Timestamps/Windows in Flink Batch
    With this change we always use WindowedValue<T> for the underlying Flink
    DataSets instead of just T. This allows us to support windowing as well.
    This changes also a lot of other stuff enabled by the above:
     - Use WindowedValue throughout
     - Add proper translation for Window.into()
     - Make side inputs window aware
     - Make GroupByKey and Combine transformations window aware, this
       includes support for merging windows. GroupByKey is implemented as a
       Combine with a concatenating CombineFn, for simplicity
    This removes Flink specific transformations for things that are handled
    by builtin sources/sinks, among other things this:
     - Removes special translation for AvroIO.Read/Write and
     - Removes special support for Write.Bound, this was not working properly
       and is now handled by the Beam machinery that uses DoFns for this
     - Removes special translation for binary Co-Group, the code was still
       in there but was never used
    With this change all RunnableOnService tests run on Flink Batch.

commit 863aa2cb2a207449e9a711c4a9e248ed134939d4
Author: Aljoscha Krettek <>
Date:   2016-05-13T12:17:50Z

    Fix faulty Flink Flatten when PCollectionList is empty

commit 5c58830c2da4c0b86d80f93251b001f96edeef35
Author: Aljoscha Krettek <>
Date:   2016-05-13T12:41:20Z

    Remove superfluous Flink Tests, Fix those that stay in
    All of the stuff in the removed ITCases is covered (in more detail) by
    the RunnableOnService tests.

commit 5e6be8c757f89d933a4e6818cf7ef6316b7195d6
Author: Aljoscha Krettek <>
Date:   2016-05-14T09:48:47Z

    Fix last last outstanding test


> Support Timestamps/Windows in Flink Batch
> -----------------------------------------
>                 Key: BEAM-270
>                 URL:
>             Project: Beam
>          Issue Type: Sub-task
>          Components: runner-flink
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
> Right now, Flink Batch execution does not use {{WindowedValue}} internally, this means
that all programs that interact with timestamps/windows will not work. We should just internally
wrap everything in {{WindowedValue}} as we do in Flink Streaming. This also makes it very
straightforward to add support for windows.

This message was sent by Atlassian JIRA

View raw message