hive-user mailing list archives

From "Owen O'Malley" <omal...@apache.org>
Subject Re: hive - snappy and sequence file vs RC file
Date Tue, 26 Jun 2012 16:49:03 GMT
SequenceFile compared to RCFile:
  * More widely deployed.
  * Available from MapReduce and Pig
  * Doesn't compress as small (RCFile stores all of each column's values
together, so similar data ends up side by side)
  * Decompresses and deserializes all of the columns, even if you are only
reading a few
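
As a rough illustration of the last two points, here is a minimal Python sketch. It does not use the real SequenceFile or RCFile formats: the table, the column names, and the comma/newline layouts are made up for the example, and zlib stands in for whatever codec you actually configure. It compresses the same synthetic data once as a stream of whole rows and once as separately compressed columns, then reads a single column back without touching the others.

```python
import zlib

# Hypothetical table: 4 low-cardinality columns whose values repeat with
# different periods, so each column is repetitive but whole rows rarely repeat.
N = 20000
columns = [
    [f"country_{i % 7:02d}" for i in range(N)],
    [f"browser_{i % 11:02d}" for i in range(N)],
    [f"status_{i % 13:02d}" for i in range(N)],
    [f"campaign_{i % 17:02d}" for i in range(N)],
]

# Row-oriented layout (SequenceFile-like): one complete record after another,
# compressed as a single stream.
row_bytes = "\n".join(",".join(row) for row in zip(*columns)).encode()
row_size = len(zlib.compress(row_bytes))

# Column-grouped layout (RCFile-like): each column's values are stored
# together and compressed on their own.
col_chunks = [zlib.compress(",".join(col).encode()) for col in columns]
col_size = sum(len(chunk) for chunk in col_chunks)

print(f"raw: {len(row_bytes)} bytes")
print(f"row-oriented, compressed: {row_size} bytes")
print(f"column-grouped, compressed: {col_size} bytes")

# Reading one column only needs that column's chunk; the other three are
# never decompressed or parsed.
status = zlib.decompress(col_chunks[2]).decode().split(",")
```

On this synthetic data the column-grouped total comes out much smaller than the row-oriented stream; how large the gap is on real tables depends on the data and the codec.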

In either case, for long-term storage, you should seriously consider
Hadoop's default codec (DefaultCodec, which is zlib-based) rather than
Snappy, since it will provide much tighter compression (at the cost of the
CPU needed to compress it).

-- Owen
