hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "CompressedStorage" by Evan
Date Thu, 10 Sep 2009 15:16:55 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by Evan:
http://wiki.apache.org/hadoop/CompressedStorage

------------------------------------------------------------------------------
  SET hive.exec.compress.output=TRUE; 
  SET io.seqfile.compression.type=BLOCK; -- NONE/RECORD/BLOCK (see below)
  INSERT OVERWRITE TABLE raw_sequence SELECT LINE FROM raw;
+ 
+ INSERT OVERWRITE TABLE raw_sequence SELECT * FROM raw; -- The previous line did not work for me, but this does.
  }}}
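  
  The INSERT statements above assume that raw_sequence is a SequenceFile-backed table and
  raw is a plain text table holding one line per row. A minimal sketch of what those
  definitions might look like (the column name and delimiters are assumptions, not part of
  this change):
  {{{
  -- Hypothetical table definitions for the snippet above (names and columns assumed)
  CREATE TABLE raw (line STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
  
  CREATE TABLE raw_sequence (line STRING)
    STORED AS SEQUENCEFILE;
  }}}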
  
  The value for io.seqfile.compression.type determines how the compression is performed.
If you set it to RECORD you will get as many output files as the number of map/reduce jobs.
If you set it to BLOCK, you will get as many output files as there were input files. There is
a tradeoff involved here: a larger number of output files allows more parallel map tasks in
downstream jobs, but yields a lower compression ratio.
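
  The compression codec itself is chosen separately from the compression type. A hedged
sketch of a BLOCK-compressed load that also picks the codec explicitly (the
mapred.output.compression.codec setting and the GzipCodec choice are assumptions, not part
of this wiki change):
  {{{
  SET hive.exec.compress.output=TRUE;
  SET io.seqfile.compression.type=BLOCK;  -- compress runs of records together
  -- assumed codec choice; any installed CompressionCodec could be used instead
  SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
  INSERT OVERWRITE TABLE raw_sequence SELECT * FROM raw;
  }}}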
