flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5266) Eagerly project unused fields when selecting aggregation fields
Date Thu, 08 Dec 2016 02:47:59 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730904#comment-15730904 ]

ASF GitHub Bot commented on FLINK-5266:

GitHub user KurtYoung opened a pull request:


    [FLINK-5266] [table] eagerly project unused fields when selecting aggregation fields

    This PR is based on #2926; only the second commit is related to this issue.
    I added a "plan" test directory to hold all the plan-level tests. I also did a small
refactoring of ProjectionTranslator, since I think it is better for each method to do only one thing.
    @fhueske As we discussed earlier in the JIRA issue (https://issues.apache.org/jira/browse/FLINK-5266)
about where this logic should be added: I decided to add it where we select fields from
a normal or grouped table. Because this logic involves rewriting some field references,
adding the needed projection node when we convert the LogicalPlan to Calcite's
RelNode would mean we also have to handle the whole rewrite ourselves.
    However, if we add the projection node up front, we only need to extract all the
field references used in the select expressions and treat them as UnresolvedFieldReferences.
The validation phase will then take care of the rewrite. I think this is easier and more
consistent with the other procedures. (Note that all the "construct" logic is fairly simple.)
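
    The extraction step described above can be sketched roughly as follows. This is only an
illustration under stated assumptions: the expression classes (FieldRef, Sum, Max, Alias) and the
helper names (fieldsOf, requiredFields) are hypothetical stand-ins, not the actual classes or
methods of Flink's Table API or ProjectionTranslator.

```scala
object EagerProjectionSketch {
  // Hypothetical, simplified expression tree; these are NOT Flink's real classes.
  sealed trait Expr
  final case class FieldRef(name: String) extends Expr
  final case class Sum(child: Expr) extends Expr
  final case class Max(child: Expr) extends Expr
  final case class Alias(child: Expr, name: String) extends Expr

  // Collect every field name referenced anywhere inside one expression.
  def fieldsOf(e: Expr): Seq[String] = e match {
    case FieldRef(n) => Seq(n)
    case Sum(c)      => fieldsOf(c)
    case Max(c)      => fieldsOf(c)
    case Alias(c, _) => fieldsOf(c)
  }

  // Fields that a projection placed before the aggregation has to keep:
  // grouping keys plus everything the select expressions touch,
  // deduplicated in first-seen order.
  def requiredFields(select: Seq[Expr], groupKeys: Seq[Expr] = Nil): Seq[String] =
    (groupKeys ++ select).flatMap(fieldsOf).distinct
}
```

    For the query from the issue, table.select('a.sum as 's, 'a.max), both select expressions
reference only field a, so in this sketch the early projection would keep just that one field.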

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/KurtYoung/flink flink-5266

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2961
commit 374d231d44f84ae385d8f8adb2353685e1214ff6
Author: Kurt Young <ykt836@gmail.com>
Date:   2016-12-08T01:27:55Z

    [FLINK-5226] [table] Use correct DataSetCostFactory and improve DataSetCalc costs.

commit 8a3ecf8e6362acd9370b11f08018a0143fc9be18
Author: Kurt Young <ykt836@gmail.com>
Date:   2016-12-08T02:35:43Z

    [FLINK-5266] [table] eagerly project unused fields when selecting aggregation fields
    Add a "plan" test dir to hold all the plan level unit tests
    Small refactory with ProjectionTranslator, keep each method handle one single thing


> Eagerly project unused fields when selecting aggregation fields
> ---------------------------------------------------------------
>                 Key: FLINK-5266
>                 URL: https://issues.apache.org/jira/browse/FLINK-5266
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>            Reporter: Kurt Young
>            Assignee: Kurt Young
> When we call a table's {{select}} method and it contains aggregations, we currently project
> fields after the aggregation. It would be better to project away unused fields before the
> aggregation, which also leaves the opportunity to push the projection into the scan.
> For example, the current logical plan of a simple query:
> {code}
> table.select('a.sum as 's, 'a.max)
> {code}
> is
> {code}
> LogicalProject(s=[$0], TMP_2=[$1])
>   LogicalAggregate(group=[{}], TMP_0=[SUM($5)], TMP_1=[MAX($5)])
>     LogicalTableScan(table=[[supplier]])
> {code}
> It would be better to project away unused fields right after the scan, so that the plan looks like this:
> {code}
> LogicalProject(s=[$0], TMP_2=[$1])
>   LogicalAggregate(group=[{}], TMP_0=[SUM($0)], TMP_1=[MAX($0)])
>     LogicalProject(a=[$5])
>       LogicalTableScan(table=[[supplier]])
> {code}

This message was sent by Atlassian JIRA
