hadoop-common-dev mailing list archives

From 侯锐 <hou...@ict.ac.cn>
Subject Where does the compression take place for MapOutputStream in Map phase?
Date Tue, 05 Jul 2011 14:20:17 GMT
Hello guys,
We would like to know where compression takes place for MapOutputStream in the map phase.

We think there are two possible places in sortAndSpill() in MapTask.java:
Writer.append() or Writer.close().
Which one performs the compression?
We would very much appreciate your response.

See the lines marked with ****** below (from sortAndSpill() in MapTask.java).

for (int i = 0; i < partitions; ++i) {
          IFile.Writer<K, V> writer = null;
          try {
            writer = new Writer<K, V>(job, out, keyClass, valClass, codec,
                                      spilledRecordsCounter);
            if (combinerRunner == null) {
                 …
                key.reset(kvbuffer, kvindices[kvoff + KEYSTART],
                          (kvindices[kvoff + VALSTART] - 
                           kvindices[kvoff + KEYSTART]));
                /**************************************/
                writer.append(key, value);   // The 1st possible place
                ++spindex;
              }
            } else {
…
              }
              …

            // close the writer
            /**************************************/
            writer.close();   // The 2nd possible place
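
To make the question concrete, here is a minimal sketch of how a generic compressing stream wrapper can split its work between the two calls. It uses only the JDK's java.util.zip.DeflaterOutputStream, not Hadoop's IFile.Writer or codec classes, so it is an analogy under that assumption and not a statement about what MapTask does. The class name CompressionPlacement, the method measure(), and the 1 MiB input size are hypothetical and chosen for illustration only.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Random;
import java.util.zip.DeflaterOutputStream;

// Hypothetical demo class; not part of Hadoop.
public class CompressionPlacement {

    /**
     * Writes data through a DeflaterOutputStream and reports how many
     * compressed bytes have reached the underlying sink
     * [0] right after write() returns and [1] after close() returns.
     */
    static int[] measure(byte[] data) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        DeflaterOutputStream out = new DeflaterOutputStream(sink);

        // Analogue of the 1st possible place: feeding records to the writer.
        out.write(data);
        int afterWrite = sink.size();

        // Analogue of the 2nd possible place: closing the writer, which
        // finishes the compressed stream and flushes what is still buffered.
        out.close();
        int afterClose = sink.size();

        return new int[] { afterWrite, afterClose };
    }

    public static void main(String[] args) throws IOException {
        // Random bytes compress poorly, so the compressor's internal
        // buffers fill up and output is emitted while writing.
        byte[] data = new byte[1 << 20];
        new Random(42).nextBytes(data);

        int[] sizes = measure(data);
        System.out.println("bytes in sink after write(): " + sizes[0]);
        System.out.println("bytes in sink after close(): " + sizes[1]);
    }
}
```

In this JDK analogy the work is shared: compressed bytes start reaching the sink during write(), and close() emits the remainder. What we would like to confirm is whether Writer.append() and Writer.close() divide the work in a similar way when a codec is configured.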

--
Rui Hou (侯锐)
Institute of Technology, Chinese Academy of Sciences




