hbase-user mailing list archives

From lars hofhansl <la...@apache.org>
Subject Re: Does compression ever improve performance?
Date Sun, 15 Jun 2014 22:55:51 GMT
Unfortunately it's not quite that simple.
Currently the HBase scanning guts expect all KeyValues to be laid out contiguously in memory,
so with encoding they need to be copied in memory to make it... We're working on fixing
it, but this is currently the way it is.

So on the one hand you fit more data into the block cache (which is unlike compression, where
the data is uncompressed before the blocks get cached), but on the other hand much more garbage
is produced during scanning and more CPU and memory bandwidth is used. So you need to test
for your use case.
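
The cache side of that tradeoff can be put into rough numbers. This is a toy model, not a measurement: the cache size, row size, and encoding ratio are all assumed values, and `cached_rows` is a hypothetical helper, not part of HBase.

```python
# Toy model only: every number below is an assumption, not an HBase measurement.

def cached_rows(cache_bytes, raw_row_bytes, in_cache_ratio):
    """Rows that fit in the cache when each row occupies
    raw_row_bytes * in_cache_ratio bytes while it is cached."""
    return int(cache_bytes // (raw_row_bytes * in_cache_ratio))

CACHE_BYTES = 1 * 1024**3   # assumed 1 GiB block cache
RAW_ROW_BYTES = 200         # assumed average unencoded row size
ENCODED_RATIO = 0.5         # assumed: encoding halves the in-memory size

# Compression only: blocks are uncompressed before they are cached,
# so rows occupy their full size in the cache.
compressed_only = cached_rows(CACHE_BYTES, RAW_ROW_BYTES, 1.0)

# Data block encoding: blocks stay encoded while cached.
encoded = cached_rows(CACHE_BYTES, RAW_ROW_BYTES, ENCODED_RATIO)

print(compressed_only, encoded)
```

The model covers only cache capacity. It says nothing about the extra copying, garbage, and CPU that encoding costs during scans, which is why a test on real data is still needed.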

-- Lars



________________________________
 From: Ted Yu <yuzhihong@gmail.com>
To: "user@hbase.apache.org" <user@hbase.apache.org> 
Sent: Sunday, June 15, 2014 3:46 PM
Subject: Re: Does compression ever improve performance?
 

Data block encoding enables block cache to hold more entries, thereby
lifting performance.
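
As a rough illustration of why encoded blocks are smaller, here is a simplified sketch of prefix-style key encoding. It is not HBase's actual block format: the function names, the sample keys, and the one-byte cost assumed for each prefix length are hypothetical.

```python
def prefix_encode(sorted_keys):
    """Store each key as (shared_prefix_len, suffix) relative to the previous key."""
    encoded = []
    prev = b""
    for key in sorted_keys:
        shared = 0
        limit = min(len(prev), len(key))
        while shared < limit and prev[shared] == key[shared]:
            shared += 1
        encoded.append((shared, key[shared:]))
        prev = key
    return encoded


def prefix_decode(encoded):
    """Rebuild the full keys; each key is copied together from prefix and suffix."""
    keys = []
    prev = b""
    for shared, suffix in encoded:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


# Hypothetical sorted keys of the form row:family:qualifier.
keys = [
    b"user123:cf:age",
    b"user123:cf:email",
    b"user123:cf:name",
    b"user124:cf:age",
]

encoded = prefix_encode(keys)
raw_size = sum(len(k) for k in keys)
# Assumes one byte to store each shared-prefix length.
encoded_size = sum(1 + len(suffix) for _, suffix in encoded)

print(raw_size, encoded_size)
```

Sorted keys share long prefixes, so most entries shrink to a short suffix. The decode step has to rebuild every full key by copying, which is work that an unencoded block does not need.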

You can find coverage of data block encoding in this wiki as well:
https://blogs.apache.org/hbase/

Cheers





On Sun, Jun 15, 2014 at 2:00 PM, Tom Brown <tombrown52@gmail.com> wrote:

> I don't mean to hijack the thread, but this question seems relevant:
>
> Does data block encoding also help performance, or does it just enable more
> efficient compression?
>
> --Tom
>
> On Saturday, June 14, 2014, Guillermo Ortiz <konstt2000@gmail.com> wrote:
>
> > I would like to see the timings they got for scans and gets in the
> > benchmark of compression and block encoding, to figure out how much time
> > you save when your data are smaller but have to be decompressed.
> >
> > On Saturday, June 14, 2014, Kevin O'dell <kevin.odell@cloudera.com> wrote:
> >
> > > Hi Jeremy,
> > >
> > >   I always recommend turning on Snappy compression; I have seen ~20%
> > > performance increases.
> > > On Jun 14, 2014 10:25 AM, "Ted Yu" <yuzhihong@gmail.com> wrote:
> > >
> > > > You may have read Doug Meil's writeup where he tried out different
> > > > ColumnFamily
> > > > compressions :
> > > >
> > > > https://blogs.apache.org/hbase/
> > > >
> > > > Cheers
> > > >
> > > >
> > > > On Fri, Jun 13, 2014 at 11:33 AM, jeremy p <athomewithagroovebox@gmail.com> wrote:
> > > >
> > > > > Thank you -- I'll go ahead and try compression.
> > > > >
> > > > > --Jeremy
> > > > >
> > > > >
> > > > > On Fri, Jun 13, 2014 at 10:59 AM, Dima Spivak <dspivak@cloudera.com> wrote:
> > > > >
> > > > > > I'd highly recommend it. In general, compressing your column families will
> > > > > > improve performance by reducing the resources required to get data from
> > > > > > disk (even when taking into account the CPU overhead of compressing and
> > > > > > decompressing).
> > > > > >
> > > > > > -Dima
> > > > > >
> > > > > >
> > > > > > On Fri, Jun 13, 2014 at 10:35 AM, jeremy p <athomewithagroovebox@gmail.com> wrote:
> > > > > >
> > > > > > > Hey all,
> > > > > > >
> > > > > > > Right now, I'm not using compression on any of my tables, because our data
> > > > > > > doesn't take up a huge amount of space. However, I would turn on
> > > > > > > compression if there was a chance it would improve HBase's performance. By
> > > > > > > performance, I'm talking about the speed with which HBase responds to
> > > > > > > requests and retrieves data.
> > > > > > >
> > > > > > > Should I turn compression on?
> > > > > > >
> > > > > > > --Jeremy
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>