hadoop-common-dev mailing list archives

From Harsh Chouraria <ha...@cloudera.com>
Subject Re: Where does the compression take place for MapOutputStream in Map phase?
Date Tue, 05 Jul 2011 22:48:14 GMT
Hello Rui Hou,

If you look at the Writer constructor used here, the answer follows directly. It takes a
compression codec as an argument. If the codec is not null (it is null when compression is
disabled), it is responsible for compressing the data by wrapping the actual output stream.
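
To make the wrapping idea concrete, here is a minimal sketch that uses only the JDK, not Hadoop. RecordWriter is a hypothetical stand-in for IFile.Writer, a boolean flag stands in for the null check on the codec, and java.util.zip.DeflaterOutputStream stands in for whatever stream a real codec would create. It only illustrates the pattern and is not the actual Hadoop code. In this sketch the bytes pass through the compressing stream on every append(), and close() finishes the compressed stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

public class SpillWriterSketch {

    /** Hypothetical stand-in for IFile.Writer; this is NOT the Hadoop class. */
    static final class RecordWriter implements AutoCloseable {
        private final OutputStream out;

        RecordWriter(OutputStream rawOut, boolean compress) {
            // Assumption: a flag replaces the "codec != null" check.
            // With compression on, the raw stream is wrapped; otherwise it is used directly.
            this.out = compress ? new DeflaterOutputStream(rawOut) : rawOut;
        }

        void append(String key, String value) throws IOException {
            // Every write goes through the (possibly compressing) wrapper.
            out.write((key + "\t" + value + "\n").getBytes(StandardCharsets.UTF_8));
        }

        @Override
        public void close() throws IOException {
            // For DeflaterOutputStream, close() finishes the compressed data
            // and then closes the underlying stream.
            out.close();
        }
    }

    /** Writes the same record repeatedly and returns the bytes that reached the sink. */
    static byte[] spill(boolean compress, int records) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (RecordWriter writer = new RecordWriter(sink, compress)) {
            for (int i = 0; i < records; i++) {
                writer.append("key", "value");
            }
        }
        return sink.toByteArray();
    }

    /** Decompresses bytes produced by spill(true, ...). */
    static byte[] inflate(byte[] data) throws IOException {
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(data))) {
            return in.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = spill(false, 1000);
        byte[] packed = spill(true, 1000);
        System.out.println("plain=" + plain.length + " compressed=" + packed.length);
    }
}
```

The caller's loop is the same in both cases. Only the stream chosen in the constructor differs, which is why neither append() nor close() needs any compression-specific logic of its own.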

The codec variable is initialized during the MapOutputStream construction accordingly.

The code for how codecs work can be read in the common code for the chosen algorithm, if you'd
like to take a look. For example, there's the DefaultCodec class.

I hope this helps! :)

P.S. Please do not cross-post to multiple lists while seeking an answer. For future MapReduce
development questions such as this, please direct them to mapreduce-dev@hadoop.apache.org

On 05-Jul-2011, at 7:50 PM, 侯锐 wrote:

> Hello guys,
> We would like to know where the compression takes place for MapOutputStream in the Map phase.
> 
> We guess there are two possible places in sortAndSpill() at MapTask.java:
> Writer.append() or Writer.close()
> Which one performs the compression?
> We would appreciate your response very much.
> 
> See lines marked by ****** as below (from sortAndSpill() at MapTask.java).
> 
> for (int i = 0; i < partitions; ++i) {
>          IFile.Writer<K, V> writer = null;
>          try {
>            writer = new Writer<K, V>(job, out, keyClass, valClass, codec,
>                                      spilledRecordsCounter);
>            if (combinerRunner == null) {
>                 …
>                key.reset(kvbuffer, kvindices[kvoff + KEYSTART],
>                          (kvindices[kvoff + VALSTART] - 
>                           kvindices[kvoff + KEYSTART]));
>                /**************************************/
>                writer.append(key, value);   // The 1st possible place
>                ++spindex;
>              }
>            } else {
> …
>              }
>              …
> 
>            // close the writer
>            /**************************************/
>            writer.close();   // The 2nd possible place
> 
> --
> Rui Hou (侯锐)
> Institute of Technology, Chinese Academy of Sciences

