hadoop-mapreduce-user mailing list archives

From Robert Dyer <psyb...@gmail.com>
Subject Re: Reg LZO compression
Date Wed, 17 Oct 2012 03:40:25 GMT
Hi Manoj,

If the data is the same for both tests but the number of mappers is
fewer, then each mapper has more (uncompressed) data to process. Each
mapper therefore takes longer, and because fewer map tasks run in
parallel, the overall execution time increases.

As a simple example: if your data is 128MB uncompressed, it may use 2
mappers, each processing 64MB of data (1 HDFS block per map task).
However, if you compress the data and it is now, say, 60MB, then one
map task will get the entire input file, decompress the data (back to
128MB), and process all of it alone. This happens because a plain .lzo
file is not splittable: Hadoop cannot start reading it mid-stream, so
it cannot divide the file among several mappers.
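
For reference, here is roughly the check TextInputFormat performs when
deciding whether an input file can be split (a paraphrase of the Hadoop
source; exact details vary by version, so treat it as a sketch):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.compress.CompressionCodec;
    import org.apache.hadoop.io.compress.CompressionCodecFactory;
    import org.apache.hadoop.io.compress.SplittableCompressionCodec;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    public class SplitCheckSketch extends TextInputFormat {
      @Override
      protected boolean isSplitable(JobContext context, Path file) {
        // Look up the codec registered for this file's extension (e.g. .lzo).
        CompressionCodec codec =
            new CompressionCodecFactory(context.getConfiguration()).getCodec(file);
        if (codec == null) {
          // Uncompressed input: split on HDFS block boundaries as usual.
          return true;
        }
        // Compressed input is only splittable if the codec supports it.
        return codec instanceof SplittableCompressionCodec;
      }
    }

As far as I know the hadoop-lzo codecs do not implement
SplittableCompressionCodec, so with the stock input formats the whole
.lzo file becomes one split (and one map task). You get the splits back
by indexing the file and using an LZO-aware input format such as
LzoTextInputFormat from the hadoop-lzo project.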

On Tue, Oct 16, 2012 at 9:27 PM, Manoj Babu <manoj444@gmail.com> wrote:
> Hi All,
>
> When using LZO compression the file size is drastically reduced and the
> number of mappers is reduced, but the overall execution time increases. I
> assume that is because the mappers deal with the same amount of data.
>
> Is this the expected behavior?
>
> Cheers!
> Manoj.
>
