flink-user mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: Compress DataSink Output
Date Fri, 19 Aug 2016 13:15:00 GMT
Hi Wes,

Flink's own OutputFormats don't support compression, but we have some tools
to use Hadoop's OutputFormats with Flink [1], and those support compression.

Let me know if you need more information.



On Thu, Aug 18, 2016 at 2:13 AM, Wesley Kerr <wesley.n.kerr@gmail.com> wrote:

> Hello -
> Forgive me if this has been asked before, but I'm trying to determine the
> best way to add compression to DataSink Outputs (starting with
> TextOutputFormat).  Realistically I would like each partition file (based
> on parallelism) to be compressed independently with gzip, but am open to
> other solutions.
> My first thought was to extend TextOutputFormat with a new class that
> compresses after closing and before returning, but I'm not sure that would
> work across all possible file systems (S3, Local, and HDFS).
> Any thoughts?
> Thanks!
> Wes
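
The per-partition gzip idea in the question above can be illustrated in plain Java, independent of any Flink or Hadoop API. This is only a minimal sketch under stated assumptions: the names GzipPartitionSketch, writePartition and readPartition are hypothetical and not part of Flink. Instead of compressing a file after it has been closed, it wraps whatever OutputStream the file system hands out in a GZIPOutputStream, so the data is compressed while it is written. Because it relies only on the OutputStream interface, the same approach should in principle apply to local, HDFS or S3 streams, though that is an assumption and not something tested here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration only; not a Flink OutputFormat.
class GzipPartitionSketch {

    // Writes one partition's records as newline-terminated UTF-8 text,
    // gzip-compressing on the fly. Closing the writer finishes the gzip
    // trailer and closes the underlying stream.
    static void writePartition(List<String> records, OutputStream rawOut) throws IOException {
        try (Writer writer = new OutputStreamWriter(
                new GZIPOutputStream(rawOut), StandardCharsets.UTF_8)) {
            for (String record : records) {
                writer.write(record);
                writer.write('\n');
            }
        }
    }

    // Decompresses a gzip byte array back into text, to check the round trip.
    static String readPartition(byte[] compressed) throws IOException {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        // An in-memory stream stands in for a single partition file.
        ByteArrayOutputStream partitionFile = new ByteArrayOutputStream();
        writePartition(Arrays.asList("first record", "second record"), partitionFile);
        System.out.print(readPartition(partitionFile.toByteArray()));
    }
}
```

Each parallel writer would call something like writePartition on its own stream, which yields one independently compressed .gz file per partition.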
