flink-user mailing list archives

From Robert Metzger <rmetz...@apache.org>
Subject Re: read .gz files
Date Thu, 19 Feb 2015 20:36:13 GMT
I just had a look at Hadoop's TextInputFormat.
hadoop-common-2.2.0.jar contains the following compression codecs:

org.apache.hadoop.io.compress.BZip2Codec
org.apache.hadoop.io.compress.DefaultCodec
org.apache.hadoop.io.compress.DeflateCodec
org.apache.hadoop.io.compress.GzipCodec
org.apache.hadoop.io.compress.Lz4Codec
org.apache.hadoop.io.compress.SnappyCodec

(See also CompressionCodecFactory, which picks the codec based on the file
extension.) So reading ".gz" files through Hadoop's TextInputFormat should
work.
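To illustrate the gzip vs. ".deflate" incompatibility mentioned in the quoted
message below, here is a minimal sketch that uses only the JDK's java.util.zip
classes, not Flink or Hadoop. It assumes that a ".deflate" file holds a zlib
stream as written by DeflaterOutputStream; the class name GzipVsDeflate and
its helper methods are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical demo class, not part of Flink or Hadoop.
class GzipVsDeflate {

    // Compress into a zlib stream (assumed here to match ".deflate" files).
    static byte[] zlibCompress(byte[] data) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (OutputStream out = new DeflaterOutputStream(buf)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Compress into the gzip container (what ".gz" files hold).
    static byte[] gzipCompress(byte[] data) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (OutputStream out = new GZIPOutputStream(buf)) {
            out.write(data);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Decompress gzip data with the gzip-aware stream.
    static byte[] gunzip(byte[] compressed) {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return readAll(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // gzip streams start with the two magic bytes 0x1f 0x8b.
    static boolean isGzip(byte[] bytes) {
        return bytes.length >= 2
                && (bytes[0] & 0xff) == 0x1f
                && (bytes[1] & 0xff) == 0x8b;
    }

    // True if a plain InflaterInputStream can decode the bytes completely.
    static boolean inflaterAccepts(byte[] compressed) {
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            readAll(in);
            return true;
        } catch (IOException e) {
            // A ZipException (an IOException) signals a header it does not understand.
            return false;
        }
    }

    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        byte[] text = "some line of text\n".getBytes(StandardCharsets.UTF_8);
        byte[] zlib = zlibCompress(text);
        byte[] gzip = gzipCompress(text);

        System.out.println("zlib has gzip magic: " + isGzip(zlib));
        System.out.println("gzip has gzip magic: " + isGzip(gzip));
        System.out.println("inflater reads zlib: " + inflaterAccepts(zlib));
        System.out.println("inflater reads gzip: " + inflaterAccepts(gzip));
        System.out.println("gunzip round trip:   "
                + new String(gunzip(gzip), StandardCharsets.UTF_8).trim());
    }
}
```

Both streams carry DEFLATE-compressed data, but a reader that expects one
container rejects the other, so a gzip-aware reader (GZIPInputStream here,
GzipCodec on the Hadoop side) is needed for ".gz" input.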


On Thu, Feb 19, 2015 at 9:31 PM, Robert Metzger <rmetzger@apache.org> wrote:

> Hi,
>
> right now Flink itself only supports reading ".deflate" files. It's
> basically the same algorithm as gzip, but gzip files have an additional
> header, which makes the two formats incompatible.
>
> But you can easily use HadoopInputFormats with Flink. I'm sure there is a
> Hadoop InputFormat for reading gzip'ed files.
>
>
> Best,
> Robert
>
>
> On Thu, Feb 19, 2015 at 9:25 PM, Sebastian <ssc.open@googlemail.com>
> wrote:
>
>> Hi,
>>
>> Does Flink support reading gzipped files? I haven't found any info about
>> this on the website.
>>
>> Best,
>> Sebastian
>>
>
>
