flink-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Kurt Young (JIRA)" <j...@apache.org>
Subject [jira] [Assigned] (FLINK-6066) Separate logical and physical RelNode layer in Flink
Date Mon, 20 Mar 2017 01:19:42 GMT

     [ https://issues.apache.org/jira/browse/FLINK-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Kurt Young reassigned FLINK-6066:

    Assignee: Kurt Young

> Separate logical and physical RelNode layer in Flink
> ----------------------------------------------------
>                 Key: FLINK-6066
>                 URL: https://issues.apache.org/jira/browse/FLINK-6066
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API & SQL
>            Reporter: Kurt Young
>            Assignee: Kurt Young
> Currently flink-table contains two layer of RelNodes to work with Calcite. One is actually
from Calcite itself, such as TableScan, Project, Filter and so on. Then depends on what environment
we are using, the RelNode translate to DataSetXXX or DataStreamXXX, like DataSetScan or DataStreamAggregate.
All the optimization rules happened in the phase in a cost base manner. 
> I suppose to further separate the second layer into two, one is more logical just like
Calcite, and the other one is more physical. In the logical layer, we can do lots of optimization
without real statistics involved, like partition pruning, projection pushdown. And we may
even use rule-based optimization for logical optimize. In physical optimize phase, we then
introduce some real statistics and to choose what ever physical strategy we want to use in
a cost base manner, like join strategy selection or join reorder. 
> Since the complexity for cost base optimization grows exponentially when the plan is
complex. By separating the optimization can make it more efficient and easier to maintain.

This message was sent by Atlassian JIRA

View raw message