hadoop-hdfs-user mailing list archives

From Han JU <ju.han.fe...@gmail.com>
Subject question about combiner
Date Fri, 10 May 2013 15:19:41 GMT

For a MapReduce job with a lot of intermediate data between mapper and
reducer, I implemented a combiner that emits a more compact representation
of the results, and I verified that the final result is correct when the
combiner is used. But when I look at the job counters "FILE_BYTES_WRITTEN"
and "Reduce shuffle bytes", the values with the combiner are twice as large
as without it. As I understand it, these two counters represent the size of
the mapper output. With a combiner, the mapper output should shrink, but
that is not the case here.

Does this mean that my combiner doesn't work, and that it actually increases
the size of the mapper output?
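
One way to sanity-check the "more compact representation" independently of
the job counters is to serialize a small sample of map output before and
after combining and compare the byte counts. The sketch below is only an
illustration under stated assumptions: the class name CombinerSizeCheck, the
sample records, and the comma-joining combine step are all hypothetical, and
the records are written with plain java.io.DataOutputStream.writeUTF rather
than Hadoop Writables. It measures record sizes in isolation and does not
model how Hadoop computes its counters.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CombinerSizeCheck {

    /** Hypothetical sample of raw map output as {key, value} pairs. */
    static List<String[]> sample() {
        List<String[]> records = new ArrayList<>();
        records.add(new String[] {"user1", "a"});
        records.add(new String[] {"user1", "b"});
        records.add(new String[] {"user2", "c"});
        return records;
    }

    /** Bytes needed to write every key and value with writeUTF. */
    static long serializedBytes(List<String[]> records) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            for (String[] kv : records) {
                out.writeUTF(kv[0]); // 2-byte length prefix + encoded key
                out.writeUTF(kv[1]); // 2-byte length prefix + encoded value
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.size();
    }

    /** Stand-in combiner (an assumption): joins the values of each key with commas. */
    static List<String[]> combine(List<String[]> records) {
        Map<String, StringBuilder> grouped = new TreeMap<>();
        for (String[] kv : records) {
            StringBuilder sb = grouped.computeIfAbsent(kv[0], k -> new StringBuilder());
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(kv[1]);
        }
        List<String[]> combined = new ArrayList<>();
        for (Map.Entry<String, StringBuilder> e : grouped.entrySet()) {
            combined.add(new String[] {e.getKey(), e.getValue().toString()});
        }
        return combined;
    }

    public static void main(String[] args) {
        List<String[]> raw = sample();
        List<String[]> combined = combine(raw);
        System.out.println("records: " + raw.size() + " -> " + combined.size());
        System.out.println("bytes:   " + serializedBytes(raw) + " -> " + serializedBytes(combined));
    }
}
```

If the combined byte count in such a check is not smaller than the raw one
for a representative sample, the combiner's output format itself is the
first thing to look at; if it is smaller, the counters need a closer look.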

*JU Han*

Software Engineer Intern @ KXEN Inc.
UTC - Université de Technologie de Compiègne
*GI06 - Fouille de Données et Décisionnel*

+33 0619608888
