hadoop-common-user mailing list archives

From: Tom White <...@cloudera.com>
Subject: Re: Reading GZIP input files.
Date: Fri, 31 Jul 2009 16:41:43 GMT
That's for the case where you want to do the decompression yourself,
explicitly, perhaps when you are reading the data out of HDFS (and not
using MapReduce). When you use compressed files as input to a MapReduce
job, Hadoop will automatically decompress them for you.
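
For instance, with the old org.apache.hadoop.mapred API that 0.18.x
uses, the driver needs nothing compression-specific. Here's a minimal
sketch (the class name and input/output paths are made up; the default
identity mapper and reducer just pass the decompressed lines through):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.TextInputFormat;

public class GzipInputDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(GzipInputDriver.class);
        conf.setJobName("read-gzipped-text");

        // TextInputFormat spots the .gz extension and decompresses
        // each file transparently before handing lines to the mapper.
        conf.setInputFormat(TextInputFormat.class);
        FileInputFormat.setInputPaths(conf, new Path("/input/gzipped")); // made-up path
        FileOutputFormat.setOutputPath(conf, new Path("/output"));       // made-up path

        JobClient.runJob(conf);
    }
}

One caveat: gzip is not a splittable format, so each .gz file is
processed by a single mapper rather than being split by HDFS block.
With 1000 files that's fine; with one huge file it would serialize
the map phase.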

Tom

On Fri, Jul 31, 2009 at 5:34 PM, David Been <davebeen@gmail.com> wrote:
> I'm new to Hadoop and reading Tom White's book, but there is an example using:
>
> // conf is an existing Configuration and fs an open FileSystem
> CompressionCodecFactory factory = new CompressionCodecFactory(conf);
> CompressionCodec codec = factory.getCodec(inputPath); // infers codec from file ext.
> InputStream in = codec.createInputStream(fs.open(inputPath)); // decompresses as you read
>
> On Fri, Jul 31, 2009 at 8:01 AM, prashant ullegaddi <prashullegaddi@gmail.com> wrote:
>> Hi guys,
>>
>> I have a set of 1000 gzipped plain text files. How do I read them in Hadoop?
>> Is there a built-in class for this?
>>
>> Btw, I'm using hadoop-0.18.3.
>>
>> Regards,
>> Prashant.
>>
>
