hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Update of "UsingLzoCompression" by RyanRawson
Date Wed, 06 May 2009 20:18:32 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by RyanRawson:
http://wiki.apache.org/hadoop/UsingLzoCompression

------------------------------------------------------------------------------
  By enabling compression, the store file (HFile) will use a compression algorithm on blocks as they are written (during flushes and compactions) and thus must be decompressed when reading.
  
  Since this adds a read-time-penalty, why would one enable any compression?  There are a few reasons why the advantages of compression can outweigh the disadvantages:
- * Compression reduces the number of bytes written to/read from HDFS
+  * Compression reduces the number of bytes written to/read from HDFS
- * Compression effectively improves the efficiency of network bandwidth and disk space
+  * Compression effectively improves the efficiency of network bandwidth and disk space
- * Compression reduces the size of data needed to be read when issuing a read
+  * Compression reduces the size of data needed to be read when issuing a read
  
  To be as low friction as necessary, a real-time compression library is preferred.  Out of the box, HBase ships with only Gzip compression, which is fairly slow.
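
  The size-versus-speed trade-off described above can be sketched outside HBase. The example below is only an illustration, not HBase code: it uses Python's standard zlib module (DEFLATE, the algorithm underlying Gzip) as a stand-in, and the block contents and the compress_block helper are hypothetical names invented for this sketch. It compresses one block of about 64 KB at several levels and reports the compressed size and elapsed time, so the exact numbers will vary by machine and data.

```python
import time
import zlib
from typing import Tuple


def compress_block(block: bytes, level: int) -> Tuple[int, float]:
    """Compress one block and return (compressed size in bytes, seconds taken)."""
    start = time.perf_counter()
    compressed = zlib.compress(block, level)
    elapsed = time.perf_counter() - start
    # A reader pays the decompression cost; check the round trip is lossless.
    assert zlib.decompress(compressed) == block
    return len(compressed), elapsed


# Hypothetical, highly repetitive block of about 64 KB (assumption for the demo).
block = b"row-key-0001,column-family:qualifier,some repetitive cell value\n" * 1024

if __name__ == "__main__":
    for level in (1, 6, 9):  # fastest, default, best compression
        size, seconds = compress_block(block, level)
        print(f"level={level} original={len(block)} compressed={size} "
              f"time={seconds * 1000:.3f} ms")
```

  Fewer bytes per block means less data written to and read from HDFS, at the price of CPU time on each write and read. A faster codec shifts that balance toward lower latency.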
  
@@ -22, +22 @@

  Lzo is a GPL'ed native-library that ships with most Linux distributions.  However, to use it in HBase, one must do the following steps:
  
  Ensure the native Lzo base library is available on every node:
- * on Ubuntu: apt-get install liblzo2-dev
+  * on Ubuntu: apt-get install liblzo2-dev
- * or Download and build [http://www.oberhumer.com/opensource/lzo/]
+  * or Download and build [http://www.oberhumer.com/opensource/lzo/]
  
  Download/patch the native connector library:
- * Download/checkout: [http://code.google.com/p/hadoop-gpl-compression/]
+  * Download/checkout: [http://code.google.com/p/hadoop-gpl-compression/]
- * Apply the patch attached to this issue: [http://code.google.com/p/hadoop-gpl-compression/issues/detail?id=6]
+  * Apply the patch attached to this issue: [http://code.google.com/p/hadoop-gpl-compression/issues/detail?id=6]
  * On Linux you may need to apply the patch: [http://code.google.com/p/hadoop-gpl-compression/issues/detail?id=5]
  * On Mac you may be interested in: [http://code.google.com/p/hadoop-gpl-compression/issues/detail?id=7]
  ** Also you will probably have to add the line to build.xml just above the call to 'configure' in compile-native:
