hbase-user mailing list archives

From "Steinmaurer Thomas" <Thomas.Steinmau...@scch.at>
Subject RE: Compression
Date Wed, 14 Sep 2011 21:16:46 GMT

We ran various tests with our expected HBase schema and ended up with
Snappy, which is included in CDH3 update 1.

While GZ-compressed data was half the size of LZO- or Snappy-compressed
data, Snappy showed much better performance than both GZ and LZO in our
tests. The compression ratios of LZO and Snappy are pretty much the same.
LZO is a hassle deployment-wise because it needs to be installed
separately, whereas Snappy is a no-brainer with CDH3 update 1.
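
For anyone who wants to run a similar comparison on their own data, the
size-versus-speed trade-off can be measured with a small harness like the
one below. This is only a rough sketch, not the test setup described
above: it uses Python's standard zlib module at two levels as a stand-in,
because Snappy and LZO bindings are third-party packages and are not
assumed to be installed. The sample_rows() payload and the measure()
helper are hypothetical names made up for illustration.

```python
import time
import zlib


def measure(name, compress, data, repeats=3):
    """Return (name, compressed/original size ratio, best time in seconds)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        out = compress(data)
        best = min(best, time.perf_counter() - start)
    ratio = len(out) / len(data)
    return name, ratio, best


def sample_rows(n=20000):
    # Hypothetical stand-in for table data: repetitive keys and values.
    # Replace this with a dump of your real rows for meaningful numbers.
    rows = [b"row-%08d|cf:qual|value-%d\n" % (i, i % 97) for i in range(n)]
    return b"".join(rows)


if __name__ == "__main__":
    data = sample_rows()
    codecs = {
        "zlib level 1 (fast)": lambda d: zlib.compress(d, 1),
        "zlib level 9 (small)": lambda d: zlib.compress(d, 9),
    }
    for name, fn in codecs.items():
        label, ratio, secs = measure(name, fn, data)
        print(f"{label}: {ratio:.1%} of original size, {secs * 1000:.1f} ms")
```

Any other codec can be added to the codecs dictionary as a function that
takes bytes and returns bytes, so Snappy or LZO bindings could be dropped
in if they are available on the machine.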


-----Original Message-----
From: Wayne [mailto:wav100@gmail.com] 
Sent: Wednesday, 14 September 2011 14:34
To: user@hbase.apache.org
Subject: Compression

I wanted to do a poll on what compression libraries people are using and
why. We currently use lzo but are considering other alternatives for
various reasons. We would like to move to CDH3 but adding lzo ourselves
is a hassle we are not looking to take on. It kind of defeats the
purpose of using CDH3 to begin with. We currently run 20.0 append.

I know there are a lot of variables that affect the best decision, but
we are looking for general trends in the community.

Is lzo still the most recommended? Is there benefit in using the lzo
professional library and does anyone use this?
Is snappy just as good as lzo and a lot easier to deal with in terms of
node build/releases?
Does zlib/gzip have any traction?

Compression ratios are important but as always performance/speed is our
biggest requirement. What are people using and why? Where is the
momentum going? Compression is a huge benefit of hadoop/hbase and having
high compression ratios with solid performance is a major benefit.

Any recommendations would be appreciated.

