orc-user mailing list archives

From Gang Wu <gan...@apache.org>
Subject Re: Add ZStandard compression to ORC -- ORC-306
Date Mon, 11 Feb 2019 22:57:56 GMT
Thanks, David, for providing your use cases.

Hi Owen, can we resume reviewing the aforementioned PR for ORC-363? Anyone
interested in reviewing this PR is welcome. Thanks!


On Fri, Feb 8, 2019 at 4:02 PM David Christle <dchristle@linkedin.com> wrote:

> Hi,
> I am interested in the status of pull request ORC-363 (
> https://github.com/apache/orc/pull/306), which adds the ZStandard
> compression codec to the Java reader/writer. I am very keen on
> experimenting with this codec for large-scale data processing and on
> driving its adoption among my colleagues, but I noticed that the PR seems
> to have stalled since the beginning of November, waiting for review. As you
> know, ZStandard is a newer compression algorithm that generally offers
> better compression than zlib at substantially faster speeds. It was recently
> enabled in the C++ writer/reader in ORC-395 (
> https://github.com/apache/orc/pull/301), but I don’t think this will work
> for using ZStandard within ORC in Apache Spark (my primary data processing
> framework).
> I do think this addition to ORC is a good one to shepherd through the
> review process, as it will be useful for anyone doing the kind of
> large-scale data processing that ORC is designed to enable – Facebook has
> already implemented ZStandard in ORC, and recently reported double-digit
> improvements in both compression and speed (
> https://code.fb.com/core-data/zstandard/) in their data warehousing
> applications.
> Kind regards,
> David Christle
