hadoop-mapreduce-user mailing list archives

From Niels Basjes <Ni...@basjes.nl>
Subject Re: Doubts on compressed file
Date Wed, 07 Nov 2012 12:47:22 GMT

> If a Gzip-compressed file is loaded into HDFS, will it be split into blocks
> and stored in HDFS?


> I understand that a single mapper can work with Gzip as it reads the entire
> file from beginning to end... In that case, if the Gzip file is larger
> than 128 MB, will it be split into blocks and stored in HDFS?

Yes. HDFS still splits the file into blocks, and the single mapper then has
to read the blocks that are not stored on its own node over the network.
To avoid that, I upload such files with an HDFS block size bigger than the
file, so the mapper has "the entire file" locally.
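
A minimal sketch of how such an upload could be prepared, in Python. The helper names (`block_size_for`, `put_command`) and the paths in the usage note are hypothetical. It assumes a 128 MB default block size and a 512-byte checksum chunk (the block size has to be a multiple of the checksum chunk size), and it assumes the property is called `dfs.blocksize`; older releases use `dfs.block.size` instead, so check your version. The script only builds the command line and does not run it.

```python
DEFAULT_BLOCK = 128 * 1024 * 1024  # assumed cluster default block size (128 MB)
CHECKSUM_CHUNK = 512               # assumed bytes per checksum chunk


def block_size_for(file_size_bytes, default_block=DEFAULT_BLOCK):
    """Return a block size that holds the whole file in a single block."""
    if file_size_bytes <= default_block:
        # The file already fits in one default-sized block.
        return default_block
    # Round the file size up to the next multiple of the checksum chunk.
    chunks = (file_size_bytes + CHECKSUM_CHUNK - 1) // CHECKSUM_CHUNK
    return chunks * CHECKSUM_CHUNK


def put_command(local_path, hdfs_path, file_size_bytes):
    """Build (but do not run) an upload command with a per-file block size."""
    size = block_size_for(file_size_bytes)
    # Property name is an assumption; older releases use dfs.block.size.
    return ["hadoop", "fs", "-D", f"dfs.blocksize={size}",
            "-put", local_path, hdfs_path]
```

For example, `put_command("logs.gz", "/data/logs.gz", os.path.getsize("logs.gz"))` (after `import os`) returns an argument list that could be handed to `subprocess.run`. Very large block sizes may hit cluster-side limits, so treat the computed value as a starting point.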

Best regards / Met vriendelijke groeten,

Niels Basjes
