flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-6516) using real row count instead of dummy row count when optimizing plan
Date Wed, 10 May 2017 08:38:04 GMT

    [ https://issues.apache.org/jira/browse/FLINK-6516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004308#comment-16004308 ]

ASF GitHub Bot commented on FLINK-6516:

Github user fhueske commented on a diff in the pull request:

    --- Diff: flink-libraries/flink-table/src/test/scala/org/apache/flink/table/utils/TestFilterableTableSource.scala
    @@ -97,6 +98,15 @@ class TestFilterableTableSource(
       override def isFilterPushedDown: Boolean = filterPushedDown
    +  override def getTableStats: TableStats = {
    --- End diff --
    If we let the `FilterableTableSource` compute the cardinality, that information will be
lost whenever the source table has valid stats registered in the `TableSourceTable`.

> using real row count instead of dummy row count when optimizing plan
> --------------------------------------------------------------------
>                 Key: FLINK-6516
>                 URL: https://issues.apache.org/jira/browse/FLINK-6516
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>            Reporter: godfrey he
>            Assignee: godfrey he
> Currently, the statistics of {{TableSourceTable}} are mostly {{UNKNOWN}}, and the statistics
from {{ExternalCatalog}} may also be null. In practice, only each {{TableSource}} knows its
own statistics exactly, especially {{FilterableTableSource}} and {{PartitionableTableSource}}.
So we can add a {{getTableStats}} method to {{TableSource}} and use it in TableSourceScan's
estimateRowCount method to get the real row count.
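
The proposal in the description can be sketched roughly as follows. This is a minimal, self-contained illustration and not Flink's actual code: the `TableStats` case class, the `Option` return type, the `FilteredSource` class, the `RowCountEstimator` object, and the `DummyRowCount` value are all simplified or hypothetical stand-ins chosen for the example. Only the names `getTableStats` and `estimateRowCount` come from the issue text.

```scala
// Hypothetical, simplified stand-in for a statistics holder (not Flink's class).
final case class TableStats(rowCount: Long)

// Simplified stand-in for a table source with the proposed hook.
trait TableSource {
  // Assumption: None means the source cannot report statistics.
  def getTableStats: Option[TableStats] = None
}

// Hypothetical filterable source that knows how many rows pass its filter.
final class FilteredSource(rows: Seq[Int], predicate: Int => Boolean)
    extends TableSource {
  override def getTableStats: Option[TableStats] =
    Some(TableStats(rows.count(predicate).toLong))
}

object RowCountEstimator {
  // Assumed placeholder value used when no statistics are available.
  val DummyRowCount: Double = 1000.0

  // Use the source's own row count if it reports one, else the dummy value.
  def estimateRowCount(source: TableSource): Double =
    source.getTableStats.map(_.rowCount.toDouble).getOrElse(DummyRowCount)
}
```

With this sketch, a source that reports statistics drives the estimate, while a source that reports none falls back to the placeholder. How such source-provided statistics should interact with statistics already registered on the table is the open question raised in the review comment above, and the sketch does not attempt to settle it.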

