flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
Date Sat, 28 Apr 2018 16:08:00 GMT

    [ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457676#comment-16457676

ASF GitHub Bot commented on FLINK-8690:

GitHub user walterddr opened a pull request:


    [FLINK-8690][table]Support DistinctAgg on DataStream

    ## What is the purpose of the change
    * Allow FlinkLogicalAggregate to support distinct aggregations on DataStream, while DataSet
continues to decompose distinct aggregates into a GROUP BY followed by normal aggregates.
    ## Brief change log
      - Moved `AggregateExpandDistinctAggregatesRule.JOIN` to `DATASET_NORM_RULES`
      - Enabled `DataStreamGroupWindowAggregate` to support distinct aggregates, while
`[DataStream/DataSet]GroupAggregate` remains unsupported.
      - Fixed a typo in the codegen for distinct aggregates on merge
      - Fixed a possible codegen test error for `UNION ALL`.
    ## Verifying this change
      - Unit tests are added for `DistinctAggregateTest`
      - Added ITCase for distinct group window agg
    ## Does this pull request potentially affect one of the following parts:
      - Dependencies (does it add or upgrade a dependency): no
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: no
      - The serializers: no
      - The runtime per-record code paths (performance sensitive): yes (codegen)
      - Anything that affects deployment or recovery: no
      - The S3 file system connector: no
    ## Documentation
      - Does this pull request introduce a new feature? yes
      - If yes, how is the feature documented? Not yet. Should it go in the `Aggregate` section
or the `Group Window` section? Input is highly appreciated. Also, distinct over aggregate was
fixed in FLINK-8689 but is not documented.
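
The decomposition mentioned under "What is the purpose of the change" can be illustrated with a small, self-contained Java sketch. It is not Flink or Calcite code: the class `DistinctRewriteSketch` and both method names are hypothetical, rows are modelled as plain (key, value) pairs, and NULL handling is ignored. It only shows that `COUNT(DISTINCT value) ... GROUP BY key` gives the same result as first grouping by (key, value) to drop duplicates and then running an ordinary count per key.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical illustration only; none of these names exist in Flink.
class DistinctRewriteSketch {

    // Direct form: SELECT key, COUNT(DISTINCT value) FROM t GROUP BY key
    static Map<String, Long> countDistinctDirect(List<Map.Entry<String, String>> rows) {
        Map<String, Set<String>> seen = new TreeMap<>();
        for (Map.Entry<String, String> r : rows) {
            seen.computeIfAbsent(r.getKey(), k -> new HashSet<>()).add(r.getValue());
        }
        Map<String, Long> out = new TreeMap<>();
        seen.forEach((k, v) -> out.put(k, (long) v.size()));
        return out;
    }

    // Decomposed form: an inner GROUP BY key, value removes duplicates,
    // then an outer plain COUNT(value) GROUP BY key counts what is left.
    static Map<String, Long> countDistinctDecomposed(List<Map.Entry<String, String>> rows) {
        Set<Map.Entry<String, String>> deduplicated = new LinkedHashSet<>(rows);
        Map<String, Long> out = new TreeMap<>();
        for (Map.Entry<String, String> r : deduplicated) {
            out.merge(r.getKey(), 1L, Long::sum);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> rows = List.of(
            Map.entry("a", "x"), Map.entry("a", "x"),
            Map.entry("a", "y"), Map.entry("b", "z"));
        System.out.println(countDistinctDirect(rows));
        System.out.println(countDistinctDecomposed(rows));
    }
}
```

The decomposed form needs the complete input before the outer count is meaningful, which suits a bounded DataSet; a stream has to keep per-group distinct state instead.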

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/walterddr/flink FLINK-8690

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5940

commit c517821d13341ae10b5d47acdbd0cc7d5bbe38b7
Author: Rong Rong <rongr@...>
Date:   2018-04-28T15:59:12Z

    moving AggregateExpandDistinctAggregatesRule.JOIN to DATASET_NORM_RULES, and enabled distinct
aggregate support for window aggregate over datastream


> Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
> -----------------------------------------------------------------------------------------------------
>                 Key: FLINK-8690
>                 URL: https://issues.apache.org/jira/browse/FLINK-8690
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Rong Rong
>            Assignee: Rong Rong
>            Priority: Major
> Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* do not allow distinct aggregates.
> We are proposing to reuse the distinct aggregate codegen work designed for *FlinkLogicalOverAggregate*
> to support unbounded distinct aggregation on DataStream as well.
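
As a rough illustration of what distinct aggregation over an unbounded stream involves, the following Java sketch keeps per-group state that remembers which values have already been seen. It is an assumption-laden sketch, not the generated code this issue refers to: the class `DistinctCountAccumulator`, its methods, and the value-to-occurrence-count state layout are all hypothetical. The `merge` method shows why accumulators must combine cleanly when two partial results (for example from merging windows) are joined.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical accumulator for COUNT(DISTINCT value); not a Flink class.
class DistinctCountAccumulator {

    // value -> number of times it has been seen (assumed state layout)
    private final Map<String, Long> occurrences = new HashMap<>();
    private long distinctCount = 0;

    // Called once per incoming record of the group.
    void accumulate(String value) {
        long timesSeen = occurrences.merge(value, 1L, Long::sum);
        if (timesSeen == 1) {
            distinctCount++; // first time this value appears
        }
    }

    // Combines another partial accumulator into this one.
    void merge(DistinctCountAccumulator other) {
        for (Map.Entry<String, Long> e : other.occurrences.entrySet()) {
            boolean isNew = !occurrences.containsKey(e.getKey());
            occurrences.merge(e.getKey(), e.getValue(), Long::sum);
            if (isNew) {
                distinctCount++; // value was unknown on this side
            }
        }
    }

    long getValue() {
        return distinctCount;
    }

    public static void main(String[] args) {
        DistinctCountAccumulator left = new DistinctCountAccumulator();
        left.accumulate("x");
        left.accumulate("x");
        left.accumulate("y");

        DistinctCountAccumulator right = new DistinctCountAccumulator();
        right.accumulate("y");
        right.accumulate("z");

        left.merge(right);
        System.out.println(left.getValue());
    }
}
```

Because the state grows with the number of distinct values per group, an unbounded stream needs some form of state cleanup in practice; that concern is outside this sketch.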

This message was sent by Atlassian JIRA
