flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-2677) Add a general-purpose keyed-window operator
Date Fri, 25 Sep 2015 15:28:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908157#comment-14908157
] 

ASF GitHub Bot commented on FLINK-2677:
---------------------------------------

GitHub user aljoscha opened a pull request:

    https://github.com/apache/flink/pull/1184

    [FLINK-2677] Add a general-purpose keyed-window operator

    This adds a new window operator that works like a combination of the Google Dataflow windowing
semantics and our existing trigger/eviction semantics.
    
    There are now two styles of API, the old policy based style:
    ```java
    DataStream<MyType> stream = ...;
    
    stream.keyBy("id")
          .window(Time.of(5, SECONDS), Count.of(10))
          .reduceWindow( (a, b) -> a.fuse(b) )
    ```
    
    and a new assigner/trigger API:
    ```java
    DataStream<MyType> stream = ...;
    
    stream.keyBy("id")
          .window(SlidingTimeWindows.of(1000, 100))
          .trigger(Count.of(100))
          .evictor(DeltaEvictor.of(...))
          .reduceWindow( (a, b) -> a.fuse(b) )
    ```
    (both trigger and evictor are optional, window assigners have default triggers specified).
    
    The new operator basically works by grouping elements by key and window where the key
is assigned by the `KeySelector` and the window is assigned by the `WindowAssigner`. The `Trigger`
specifies when one of the groups that emerge from this should fire. The windows for different
groups can fire at arbitrary times.
    
    The policy based windows are translated to either @StephanEwen's new fast aligned time
windows operator or the generic assigner/trigger/evictor operator, based on what is possible.
    
    I added tests for the `WindowOperator` itself in `WindowOperatorTest` and `EvictingWindowOperatorTest`.
The translation from the API to correct window operator is verified in `PolicyWindowTranslationTest`
and `TriggerWindowTranslationTest`.
    
    This does not yet change the documentation since the old API is still completely intact.
The new API is added in parallel to the old API. Once we write the documentation the old code
should be removed.
    
    This subsumes #1175 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/aljoscha/flink windowing-next-cleanup

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1184.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1184
    
----
commit 61c3666a272413940e65a2195b87c7472a8e8806
Author: Stephan Ewen <sewen@apache.org>
Date:   2015-09-23T10:05:54Z

    [FLINK-2753] [streaming] [api breaking] Add first parts of new window API for key grouped
windows
    
    This follows the API design outlined in https://cwiki.apache.org/confluence/display/FLINK/Streams+and+Operations+on+Streams
    
    This is API breaking because it adds new generic type parameters to Java and Scala classes,
breaking binary compatibility.

commit 1f92f28451a39f20f05407b221fc2f583650f003
Author: Aljoscha Krettek <aljoscha.krettek@gmail.com>
Date:   2015-09-24T14:30:29Z

    Fix some typos and naming inconsistencies in new Windowing Code

commit f48c9f9109d51077db9dd4cf8b4f577942da4961
Author: Aljoscha Krettek <aljoscha.krettek@gmail.com>
Date:   2015-09-24T14:33:09Z

    Move window operators and tests to windowing package
    
    The api package is also called windowing, this harmonizes the package
    names.

commit 31dff7a1eaf5d72eed0872a2064df44f1c99ae72
Author: Aljoscha Krettek <aljoscha.krettek@gmail.com>
Date:   2015-09-24T14:40:46Z

    Harmonize generic parameter names in Stream API classes

commit ca13af49a477b863fe7790d533c47cb837a6ffb9
Author: Aljoscha Krettek <aljoscha.krettek@gmail.com>
Date:   2015-09-24T15:42:15Z

    Add Count and Delta  WindowPolicy

commit 38caf8b5bbab35b115836b1c3a279a532c7a0930
Author: Aljoscha Krettek <aljoscha.krettek@gmail.com>
Date:   2015-09-24T15:39:19Z

    Add Window parameter to KeyedWindowFunction, move to package windowing

commit 329bd1f0a0cc1e29e279bd73210660d79de12788
Author: Aljoscha Krettek <aljoscha.krettek@gmail.com>
Date:   2015-09-25T09:34:10Z

    Move delta window functions to package functions.windowing.delta

commit e1c643335c57c5bb6f2f2e6012485202a33bab78
Author: Aljoscha Krettek <aljoscha.krettek@gmail.com>
Date:   2015-09-25T10:27:35Z

    [FLINK-2677] Add a general-purpose keyed-window operator

----


> Add a general-purpose keyed-window operator
> -------------------------------------------
>
>                 Key: FLINK-2677
>                 URL: https://issues.apache.org/jira/browse/FLINK-2677
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Streaming
>    Affects Versions: 0.10
>            Reporter: Stephan Ewen
>            Assignee: Aljoscha Krettek
>             Fix For: 0.10
>
>
> This operator should support:
>   - Customizable triggers
>   - Eviction on triggers: all / none / custom
>   - Discard by time (expiry of state)
>   - Event time time window assignment
> This set of requirements is effectively a mix between the current trigger/evict model
and the Cloud Dataflow window definition model.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message