flink-issues mailing list archives

From uybhatti <...@git.apache.org>
Subject [GitHub] flink pull request #4670: [FLINK-2170] [connectors] Add ORC connector for Ta...
Date Thu, 14 Sep 2017 09:36:04 GMT
GitHub user uybhatti opened a pull request:

    https://github.com/apache/flink/pull/4670

    [FLINK-2170] [connectors] Add ORC connector for TableSource

    ## What is the purpose of the change
    Flink currently cannot read data from ORC files. This PR adds support for loading data from ORC files into a TableSource.
    
    
    ## Brief change log
      - The RowOrcInputFormat, OrcUtils, and OrcTableSource classes implement the above functionality. In addition, OrcTableSource implements the ProjectableTableSource and FilterableTableSource interfaces.
      - As an optimization, ORC files are read in batches instead of a single row at a time.
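
The batch-at-a-time read pattern mentioned in the change log can be sketched roughly as follows. This is a minimal illustration in plain Java with no Flink or ORC dependencies, not the code in this PR: the class `BatchedRowReader`, its methods, and the `batchSize` parameter are hypothetical names, and an `Iterator` stands in for the underlying ORC reader. The reader hands out rows one at a time but only touches the source once per batch.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical sketch: serves rows one by one, but fetches them from the source in batches. */
final class BatchedRowReader<T> {
    private final Iterator<T> source;            // stand-in for the underlying file reader (assumption)
    private final int batchSize;                 // maximum number of rows fetched per batch
    private final List<T> batch = new ArrayList<>();
    private int nextIndex = 0;                   // position of the next unread row in the batch
    private int batchesFetched = 0;              // counts non-empty batches, for illustration only

    BatchedRowReader(Iterable<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.source = rows.iterator();
        this.batchSize = batchSize;
    }

    /** Refills the in-memory batch; returns false if the source is exhausted. */
    private boolean fetchBatch() {
        batch.clear();
        nextIndex = 0;
        while (batch.size() < batchSize && source.hasNext()) {
            batch.add(source.next());
        }
        if (!batch.isEmpty()) {
            batchesFetched++;
        }
        return !batch.isEmpty();
    }

    /** True once the current batch is consumed and no further batch can be fetched. */
    boolean reachedEnd() {
        return nextIndex >= batch.size() && !fetchBatch();
    }

    /** Returns the next row from the current batch, fetching a new batch when needed. */
    T nextRecord() {
        if (reachedEnd()) {
            throw new NoSuchElementException("no more rows");
        }
        return batch.get(nextIndex++);
    }

    int getBatchesFetched() {
        return batchesFetched;
    }
}
```

With five rows and a batch size of two, such a reader would return all five rows in order while fetching from the source only three times.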
    
    ## Verifying this change
    This change added tests and can be verified as follows:
      - RowOrcInputFormatTest verifies that reading different data types, nested data types, and projections is correct.
      - OrcTableSourceTest and OrcTableSourceITCase verify that ORC data is loaded into the TableSource correctly.
      
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): **yes**
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`:  **no**
      - The serializers: (yes / no / don't know)
      - The runtime per-record code paths (performance sensitive): **no**
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: **no**
    
    ## Documentation
    
      - Does this pull request introduce a new feature? **yes**
      - If yes, how is the feature documented? **not documented**


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uybhatti/flink FLINK-2170

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/4670.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4670
    
----
commit 6629db4daf99960f0251948cb4569eb7c6efbade
Author: Fabian Hueske <fhueske@apache.org>
Date:   2017-03-03T22:55:22Z

    [FLINK-2170] [connectors] Add ORC connector for TableSource

----

