hbase-dev mailing list archives

From Jonathan Gray <jg...@fb.com>
Subject RE: Converting byte[] to ByteBuffer
Date Mon, 11 Jul 2011 19:18:27 GMT
In my experience, CPU usage in HBase is very high for highly concurrent applications.  You
can expect the CMS GC to chew up 2-3 cores at sufficient throughput, with the remaining
cores' time spent in CSLM/MemStore, KeyValue comparators, queues, etc.
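
As a rough picture of why the CSLM shows up so prominently (a toy sketch, not
the actual HBase source): the MemStore is essentially a ConcurrentSkipListMap
ordered by a byte-wise key comparator, so every put and every scan seek pays
skip-list traversal plus a comparator call per probe.

    import java.util.Comparator;
    import java.util.SortedMap;
    import java.util.concurrent.ConcurrentSkipListMap;

    // Toy stand-in for the MemStore: a concurrent skip list keyed by raw
    // key bytes.  Every insert and lookup walks O(log n) skip-list levels
    // and invokes the comparator at each step -- that is the CPU cost
    // being discussed.
    public class ToyMemStore {
        // Unsigned lexicographic comparator, like KeyValue ordering.
        static final Comparator<byte[]> KEY_CMP = (a, b) -> {
            for (int i = 0, n = Math.min(a.length, b.length); i < n; i++) {
                int d = (a[i] & 0xff) - (b[i] & 0xff);
                if (d != 0) return d;
            }
            return a.length - b.length;
        };

        private final ConcurrentSkipListMap<byte[], byte[]> kvs =
            new ConcurrentSkipListMap<>(KEY_CMP);

        public void put(byte[] key, byte[] value) {
            kvs.put(key, value);           // skip-list insert: comparator-heavy
        }

        public SortedMap<byte[], byte[]> scanFrom(byte[] startKey) {
            return kvs.tailMap(startKey);  // seek: more comparator calls
        }
    }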

> -----Original Message-----
> From: Jason Rutherglen [mailto:jason.rutherglen@gmail.com]
> Sent: Sunday, July 10, 2011 3:05 PM
> To: dev@hbase.apache.org
> Subject: Re: Converting byte[] to ByteBuffer
> 
> Ted,
> 
> Interesting.  I think we need to take a deeper look at why essentially turning
> off the caching of uncompressed blocks doesn't [seem to] matter.  My guess
> is that it's cheaper to decompress on the fly than to crowd out the system IO
> cache with JVM heap usage.
> 
> I.e., CPU is cheaper than disk IO.
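
A minimal sketch of the trade-off being guessed at here, using java.util.zip's
DEFLATE as a stand-in for whatever codec the table actually uses: only the
compressed bytes stay resident, and each read pays a decompression pass instead
of holding uncompressed blocks on the heap.

    import java.io.ByteArrayOutputStream;
    import java.util.zip.DataFormatException;
    import java.util.zip.Deflater;
    import java.util.zip.Inflater;

    // Sketch: cache only compressed bytes and inflate per read, trading
    // CPU (decompression) for a smaller heap and IO-cache footprint.
    public class OnTheFlyBlock {
        static byte[] compress(byte[] raw) {
            Deflater d = new Deflater();
            d.setInput(raw);
            d.finish();
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!d.finished()) {
                bos.write(buf, 0, d.deflate(buf));
            }
            d.end();
            return bos.toByteArray();
        }

        // Paid on every read of the block: CPU instead of cached heap.
        static byte[] decompress(byte[] compressed, int rawLen)
                throws DataFormatException {
            Inflater inf = new Inflater();
            inf.setInput(compressed);
            byte[] out = new byte[rawLen];
            int off = 0;
            while (off < rawLen && !inf.finished()) {
                off += inf.inflate(out, off, rawLen - off);
            }
            inf.end();
            return out;
        }
    }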
> 
> Further (I asked this previously): where does the general CPU usage in HBase
> go?  Binary search on keys for seeking, skip list reads and writes, and [maybe]
> MapReduce jobs?  The rest should more or less be in the noise (or is general
> Java overhead).
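
For a concrete picture of the seek cost named above (a simplified sketch, not
HBase's actual block index code): seeking a key is a binary search over the
sorted first-keys of the data blocks, one comparator call per probe.

    import java.util.Arrays;
    import java.util.Comparator;

    // Sketch of the seek path: find the data block that may contain 'key'
    // by binary-searching the sorted array of per-block start keys.
    public class BlockIndexSeek {
        // Unsigned lexicographic order over raw key bytes.
        static final Comparator<byte[]> CMP = (a, b) -> {
            for (int i = 0, n = Math.min(a.length, b.length); i < n; i++) {
                int d = (a[i] & 0xff) - (b[i] & 0xff);
                if (d != 0) return d;
            }
            return a.length - b.length;
        };

        // blockStartKeys must be sorted by CMP.  On a miss, binarySearch
        // returns -(insertionPoint) - 1, so the covering block is the one
        // just before the insertion point.
        static int findBlock(byte[][] blockStartKeys, byte[] key) {
            int idx = Arrays.binarySearch(blockStartKeys, key, CMP);
            return idx >= 0 ? idx : Math.max(0, -idx - 2);
        }
    }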
> 
> I'd be curious to know the avg CPU consumption of an active HBase system.
> 
> On Sat, Jul 9, 2011 at 11:14 PM, Ted Dunning <tdunning@maprtech.com> wrote:
> > No.  The JNI is below the HDFS compatible API.  Thus the changed code
> > is in the hadoop.jar and associated jars and .so's that MapR supplies.
> >
> > The JNI still runs in the HBase memory image, though, so it can make
> > data available faster.
> >
> > The cache involved includes the cache of disk blocks (not HBase
> > memcache blocks) in the JNI and in the filer sub-system.
> >
> > The detailed reasons why more caching in the file system and less in
> > HBase makes the overall system faster are not completely worked out,
> > but the general outlines are pretty clear.  There are likely several
> > factors at work in any case, including less GC cost due to a smaller
> > memory footprint, caching compressed blocks instead of Java
> > structures, and simplification due to a clean memory hand-off with an
> > associated strong demarcation of where different memory allocators
> > have jurisdiction.
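
Tying this back to the subject line, one way to picture the "clean memory
hand-off" is a cache of compressed blocks held in direct ByteBuffers: the bulk
of the cache sits outside the GC-managed heap, and bytes cross the boundary
only at put and read time.  A sketch under that assumption, not a description
of MapR's actual mechanism:

    import java.nio.ByteBuffer;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch: an off-heap cache of compressed blocks.  Direct buffers live
    // outside the GC-managed heap, so a large cache does not inflate GC
    // work; bytes are copied into the heap only when a block is read.
    public class OffHeapBlockCache {
        private final Map<String, ByteBuffer> cache = new ConcurrentHashMap<>();

        public void put(String blockKey, byte[] compressed) {
            ByteBuffer direct = ByteBuffer.allocateDirect(compressed.length);
            direct.put(compressed);
            direct.flip();                   // hand-off: heap -> off-heap
            cache.put(blockKey, direct);
        }

        public byte[] get(String blockKey) {
            ByteBuffer direct = cache.get(blockKey);
            if (direct == null) return null;
            byte[] copy = new byte[direct.remaining()];
            direct.duplicate().get(copy);    // hand-off: off-heap -> heap
            return copy;
        }
    }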
> >
> > On Sat, Jul 9, 2011 at 3:48 PM, Jason Rutherglen
> > <jason.rutherglen@gmail.com> wrote:
> >
> >> I'm a little confused; I was told none of the HBase code changed with
> >> MapR.  If the HBase (not the OS) block cache has a JNI implementation,
> >> then that part of the HBase code changed.
> >> On Jul 9, 2011 11:19 AM, "Ted Dunning" <tdunning@maprtech.com> wrote:
> >> > MapR does help with the GC because it *does* have a JNI interface
> >> > into an external block cache.
> >> >
> >> > Typical configurations with MapR trim HBase down to the minimal
> >> > viable size and increase the file system cache correspondingly.
> >> >
> >> > On Fri, Jul 8, 2011 at 7:52 PM, Jason Rutherglen
> >> > <jason.rutherglen@gmail.com> wrote:
> >> >
> >> >> MapR doesn't help with the GC issues. If MapR had a JNI interface
> >> >> into an external block cache then that'd be a different story. :)
> >> >> And I'm sure it's quite doable.
> >> >>
> >>
> >
