flink-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (FLINK-5280) Extend TableSource to support nested data
Date Wed, 04 Jan 2017 02:19:58 GMT

    [ https://issues.apache.org/jira/browse/FLINK-5280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15796909#comment-15796909 ]

ASF GitHub Bot commented on FLINK-5280:
---------------------------------------

Github user wuchong commented on the issue:

    https://github.com/apache/flink/pull/3039
  
    What about extracting `getDataSet(ExecutionEnvironment)` and `getDataStream(StreamExecutionEnvironment)`
into interfaces named, say, `DataSetGetter` and `DataStreamGetter`?
    
    We could then make `BatchTableSource` extend the `TableSource` abstract class and implement
the `DataSetGetter` interface, make `StreamTableSource` extend `TableSource` and implement
`DataStreamGetter`, and make `BatchStreamTableSource` implement both `DataSetGetter` and
`DataStreamGetter`. That way we can use a `TableSource` plus a `DataSetGetter` wherever only
a `BatchTableSource` is currently expected. For example, `BatchTableSourceScan` could be
changed to something like this:
    
    ```scala
    class BatchTableSourceScan(
        cluster: RelOptCluster,
        traitSet: RelTraitSet,
        table: RelOptTable,
        val tableSource: TableSource[_],
        val datasetGetter: DataSetGetter)
    ```
    
    Would this solve our problem?
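
To make the proposal concrete, the hierarchy described above might be sketched roughly as follows. This is only an illustration, not the actual Flink code: the environment and data set classes are empty stand-ins so the snippet compiles on its own, the `fieldNames` member and the `InMemorySource` class are hypothetical, the type parameter on the getter interfaces is an assumption, and the Calcite constructor arguments of `BatchTableSourceScan` are left out.

```scala
// Stand-ins for the real Flink classes (hypothetical, for illustration only).
final class ExecutionEnvironment
final class StreamExecutionEnvironment
final case class DataSet[T](elements: Seq[T])
final case class DataStream[T](elements: Seq[T])

// Common base class; `fieldNames` is a hypothetical placeholder member.
abstract class TableSource[T] {
  def fieldNames: Seq[String]
}

// The two extracted interfaces from the proposal.
trait DataSetGetter[T] {
  def getDataSet(env: ExecutionEnvironment): DataSet[T]
}

trait DataStreamGetter[T] {
  def getDataStream(env: StreamExecutionEnvironment): DataStream[T]
}

// Batch, streaming, and combined sources built from the base class plus getters.
abstract class BatchTableSource[T] extends TableSource[T] with DataSetGetter[T]
abstract class StreamTableSource[T] extends TableSource[T] with DataStreamGetter[T]
abstract class BatchStreamTableSource[T]
  extends TableSource[T] with DataSetGetter[T] with DataStreamGetter[T]

// The scan only needs "a TableSource" plus "something that yields a DataSet".
// The Calcite arguments (cluster, traitSet, table) are omitted in this sketch.
class BatchTableSourceScan(
    val tableSource: TableSource[_],
    val dataSetGetter: DataSetGetter[_])

// Hypothetical source that serves both batch and streaming from memory.
class InMemorySource(rows: Seq[String]) extends BatchStreamTableSource[String] {
  override def fieldNames: Seq[String] = Seq("value")
  override def getDataSet(env: ExecutionEnvironment): DataSet[String] =
    DataSet(rows)
  override def getDataStream(env: StreamExecutionEnvironment): DataStream[String] =
    DataStream(rows)
}
```

With this shape, a single object can be passed for both constructor arguments, e.g. `new BatchTableSourceScan(src, src)` for some `src: InMemorySource`, which is what lets the scan accept either a `BatchTableSource` or a `BatchStreamTableSource`.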
    



> Extend TableSource to support nested data
> -----------------------------------------
>
>                 Key: FLINK-5280
>                 URL: https://issues.apache.org/jira/browse/FLINK-5280
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>    Affects Versions: 1.2.0
>            Reporter: Fabian Hueske
>            Assignee: Ivan Mushketyk
>
> The {{TableSource}} interface does currently only support the definition of flat rows.
> However, there are several storage formats for nested data that should be supported such as Avro, Json, Parquet, and Orc. The Table API and SQL can also natively handle nested rows.
> The {{TableSource}} interface and the code to register table sources in Calcite's schema need to be extended to support nested data.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
