flink-issues mailing list archives

From "Fabian Hueske (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (FLINK-7214) Add a sink that writes to ORCFile on HDFS
Date Mon, 17 Jul 2017 17:09:00 GMT

     [ https://issues.apache.org/jira/browse/FLINK-7214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Fabian Hueske updated FLINK-7214:
    Component/s:     (was: Batch Connectors and Input/Output Formats)
                 Streaming Connectors

> Add a sink that writes to ORCFile on HDFS
> -----------------------------------------
>                 Key: FLINK-7214
>                 URL: https://issues.apache.org/jira/browse/FLINK-7214
>             Project: Flink
>          Issue Type: New Feature
>          Components: Streaming Connectors
>            Reporter: Robert Rapplean
>            Priority: Minor
>              Labels: features, hdfssink, orcfile
> The ORCFile format is currently one of the most efficient storage formats on HDFS in terms of both storage and search speed, and it is a well-supported standard.
> This feature would receive an input stream, map its columns to the columns of a Hive table, and write the data to HDFS in ORC format. It would need to support Hive bucketing and dynamic Hive partitioning, and generate the appropriate metadata in the Hive database.
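
The bucketing and dynamic partitioning requirement above comes down to deciding, per record, which directory and file the record belongs in. A minimal sketch of that routing step in plain Java follows. It is not part of the issue and not an existing Flink or Hive API: the class name, method names, table root and bucket-file naming are all hypothetical, and Java's hashCode() is only a stand-in for Hive's bucketing hash, which depends on the column type and the Hive version. It also assumes at least one partition column.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: computes where a record would land in a
// partitioned, bucketed table layout. Illustration only.
public final class OrcSinkLayoutSketch {

    // Builds a Hive-style partition directory such as "dt=2017-07-17/country=US".
    // The map's iteration order defines the nesting order of the directories.
    public static String partitionPath(Map<String, String> partitionValues) {
        StringJoiner path = new StringJoiner("/");
        for (Map.Entry<String, String> e : partitionValues.entrySet()) {
            path.add(e.getKey() + "=" + e.getValue());
        }
        return path.toString();
    }

    // Maps a bucketing key to a bucket index in [0, numBuckets).
    // ASSUMPTION: hashCode() stands in for Hive's own hash function.
    public static int bucketFor(Object bucketKey, int numBuckets) {
        // Clear the sign bit so the modulo result is never negative.
        return (bucketKey.hashCode() & Integer.MAX_VALUE) % numBuckets;
    }

    // Combines table root, partition directory and a zero-padded bucket file name.
    // ASSUMPTION: the "NNNNNN_0" file name pattern is illustrative only.
    public static String targetFile(String tableRoot,
                                    Map<String, String> partitionValues,
                                    Object bucketKey,
                                    int numBuckets) {
        String bucketFile =
                String.format(Locale.ROOT, "%06d_0", bucketFor(bucketKey, numBuckets));
        return tableRoot + "/" + partitionPath(partitionValues) + "/" + bucketFile;
    }

    public static void main(String[] args) {
        Map<String, String> partition = new LinkedHashMap<>();
        partition.put("dt", "2017-07-17");
        partition.put("country", "US");
        // Route a record whose bucketing column holds the integer 42 into 8 buckets.
        System.out.println(targetFile("/warehouse/events", partition, 42, 8));
    }
}
```

A real sink would additionally have to write the ORC data itself and register each new partition in the Hive database, which this sketch does not attempt.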
