hbase-user mailing list archives

From "Oliver Meyn (GBIF)" <om...@gbif.org>
Subject Re: Scan performance on compressed column families
Date Fri, 09 Nov 2012 19:46:58 GMT
Hi David,

I wrote that blog post, and I know that Lars George has much more experience than I do with tuning
HBase, especially in different environments, so weigh our opinions accordingly.  As he says,
it will "usually" help. The unusual cases, such as the lower-spec hardware I ran those tests
on, are where it might hurt scans, though it obviously still helps with disk and network use.  So
take my post with a grain of salt, and as Kevin says, try it out on your data and your cluster
and see what works best for you.
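
To make the trade-off concrete, here is a rough back-of-envelope sketch in Python. It is not taken from the blog post or from the book: the function name, the simple "read time plus decompression time" model, and every number in it are assumptions for illustration only, and it ignores caching, network transfer and any overlap of I/O with CPU work.

```python
def scan_seconds(raw_mb, disk_mb_s, ratio=1.0, decompress_mb_s=None):
    """Estimate the time to scan raw_mb of logical (uncompressed) data.

    ratio           -- compressed size / raw size (1.0 means no compression)
    decompress_mb_s -- rate at which the CPU produces decompressed data,
                       or None when the column family is not compressed
    All inputs are hypothetical; measure your own cluster for real values.
    """
    read_time = raw_mb * ratio / disk_mb_s  # time spent reading from disk
    cpu_time = 0.0 if decompress_mb_s is None else raw_mb / decompress_mb_s
    return read_time + cpu_time  # pessimistic: assumes no I/O and CPU overlap


# Hypothetical cluster: 1000 MB of data, disks delivering 100 MB/s.
uncompressed = scan_seconds(1000, 100)
fast_cpu = scan_seconds(1000, 100, ratio=0.5, decompress_mb_s=400)
slow_cpu = scan_seconds(1000, 100, ratio=0.5, decompress_mb_s=150)

print(uncompressed, fast_cpu, slow_cpu)
```

With these made-up figures the compressed scan beats the uncompressed one when the CPU decompresses quickly, and loses when it does not, which is the "usually helps, sometimes hurts" situation described above. Which side your cluster falls on is something only a test on your own data will tell you.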


On 2012-11-03, at 3:57 PM, David Koch wrote:

> Hello,
> Are scans faster when compression is activated? The HBase book by Lars
> George seems to suggest so (p424, Section on "Compression" in chapter
> "Performance Tuning").
> "... compression usually will yield overall better performance, because the
> overhead of the CPU performing the compression and decompression is less
> than what is required to read more data from disk."
> I searched around for a bit and found this:
> http://gbif.blogspot.fr/2012/02/performance-evaluation-of-hbase.html. The
> author conducted a series of scan performance tests on tables of up to
> 200 million rows and found that compression actually slowed down read
> performance slightly - albeit at lower CPU load.
> Thank you,
> /David

Oliver Meyn
Software Developer
Global Biodiversity Information Facility (GBIF)
+45 35 32 15 12
