orc-dev mailing list archives

From Deepak Majeti <majeti.dee...@gmail.com>
Subject Re: ORC contribution from Alibaba
Date Wed, 26 Apr 2017 15:08:29 GMT
Hi Gang and Xiening,

We at Vertica have been actively contributing to and using the ORC C++
project as well.
A C++ writer will be a great addition to this project, and we look
forward to working with you on merging your contributions.

On Wed, Apr 26, 2017 at 2:13 AM, Gang Wu <gang.w@alibaba-inc.com> wrote:

> Hi,
> This is Gang from Alibaba, working on Alibaba's big data platform -
> MaxCompute. We have developed our own columnar storage format within
> MaxCompute to support MapReduce and other batch processing workloads. But as
> Apache ORC is getting popular in the industry, we are actively looking at
> integrating the ORC format into MaxCompute.
> In the past few months, Xiening (cc'ed) and I have been working on
> enhancing ORC C++ to provide a full-featured C++ reader and writer. Our work
> mainly involves adding a C++ writer that supports all data types and stats,
> and supporting indexes for both the reader and the writer. As of today, we
> have finished development and testing and plan to contribute this work back
> to the Apache ORC project. We have communicated with Owen via email and have
> created an umbrella JIRA, ORC-179, for the plan. In brief, we plan to do the
> following:
>   1. Refactor common classes for writer and reader
>     -- extract common classes and functions for writer and reader to share
>   2. OutputStream interface for writer
>     -- implement several output streams for writing to memory, file, etc.
>     -- implement ByteRleEncoder, RleEncoder, BooleanRleEncoder, etc.
>     -- support zlib compression
>   3. ORC Writer
>     -- write orc file header, file footer, postscript, etc.
>     -- write columns of all types
>     -- write column statistics
>     -- write the index stream in the writer; the reader seeks to a row
>        based on index information
>   4. Other
>     -- some minor bug fixes to the current code base.
> Should you have any questions, please feel free to contact us. Any
> feedback and suggestions are welcome. Thanks!
> Gang Wu
> Senior Engineer
> Alibaba Group

Deepak Majeti,
Software Engineer at Vertica
