hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Hive/CompressedStorage" by Peter Voss
Date Fri, 07 May 2010 08:48:11 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "Hive/CompressedStorage" page has been changed by Peter Voss.
http://wiki.apache.org/hadoop/Hive/CompressedStorage?action=diff&rev1=5&rev2=6

--------------------------------------------------

  
  SET hive.exec.compress.output=true; 
  SET io.seqfile.compression.type=BLOCK; -- NONE/RECORD/BLOCK (see below)
- INSERT OVERWRITE TABLE raw_sequnce SELECT LINE FROM raw;
+ INSERT OVERWRITE TABLE raw_sequence SELECT LINE FROM raw;
  
- INSERT OVERWRITE TABLE raw_sequnce SELECT * FROM raw; -- The previous line did not work for me, but this does.
+ INSERT OVERWRITE TABLE raw_sequence SELECT * FROM raw; -- The previous line did not work for me, but this does.
  }}}
  
  The value of io.seqfile.compression.type determines how the compression is performed. If you set it to RECORD, you will get as many output files as the number of map/reduce jobs. If you set it to BLOCK, you will get as many output files as there were input files. There is a tradeoff involved here: a larger number of output files => more parallel map jobs => a lower compression ratio.
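
  For context, an end-to-end sketch of the workflow above might look like the following. Only the SET and INSERT statements appear in the diff; the CREATE TABLE statements (and the assumption that both tables have a single LINE column) are illustrative, not part of the original page.

  {{{
  -- Hypothetical setup: a plain-text source table and a SequenceFile target table.
  CREATE TABLE raw (LINE STRING)
    STORED AS TEXTFILE;
  CREATE TABLE raw_sequence (LINE STRING)
    STORED AS SEQUENCEFILE;

  -- Enable compressed output and choose the SequenceFile compression granularity.
  SET hive.exec.compress.output=true;
  SET io.seqfile.compression.type=BLOCK;  -- NONE/RECORD/BLOCK

  -- Populate the compressed table from the uncompressed one.
  INSERT OVERWRITE TABLE raw_sequence SELECT * FROM raw;
  }}}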
