hbase-user mailing list archives

From Ted Yu <yuzhih...@gmail.com>
Subject Re: Low CPU usage and slow reads in pseudo-distributed mode - how to fix?
Date Sun, 11 Jan 2015 00:19:33 GMT
Please see http://hbase.apache.org/book.html#perf.reading

I guess you are using 0.90.4 because of the Nutch integration. Still, 0.90.x
is far too old.

bq. HBase has a heapsize of 1.5 Gigs

This is not enough memory for good read performance. Please consider giving
HBase more heap.
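
As a rough sketch, assuming the stock conf/hbase-env.sh is in use, the heap
could be raised there. HBASE_HEAPSIZE is given in MB; the 2048 below is only an
illustrative value, not a recommendation, since the machine's 4 GB is shared
with Hadoop and Nutch:

```shell
# conf/hbase-env.sh
# Maximum heap for the HBase daemons, in MB.
# 2048 is an assumed example value; size it to what the machine can spare.
export HBASE_HEAPSIZE=2048
```

HBase needs to be restarted for the new heap size to take effect.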


On Sat, Jan 10, 2015 at 4:04 PM, Dave Benson <davehbenson@gmail.com> wrote:

> Hi HBase users,
> I'm working with HBase for the first time and I'm trying to sort out a
> performance issue. HBase is the data store for a small, focused web crawl
> I'm performing with Apache Nutch. I'm running in pseudo-distributed mode,
> meaning that Nutch, HBase and Hadoop are all on the same machine. The
> machine's a few years old and has only 4 gigs of RAM - much smaller than
> most HBase installs, I know.
> When I first start my HBase processes I get about 60 seconds of fast
> performance. HBase reads quickly and uses a healthy portion of CPU cycles.
> After a minute or so, though, HBase slows dramatically. Reads sink to a
> glacial pace, and the CPU sits mostly idle.
> I notice this pattern when I run Nutch - particularly during read-heavy
> operations - but also when I run a simple row counter from the shell.
> At the moment " count 'my_table' " takes almost 4 hours to read through
> 500,000 rows. The reading is much faster at the start than at the end. In the
> first 30 seconds, HBase counts 37000 rows, but in the 30 seconds between
> 8:00 and 8:30, only 1000 are counted.
> Looking through my Ganglia report I see a brief return to high performance
> around 3 hours into the count. I don't know what's causing this spike.
> Can anyone suggest what configuration parameters I should change to improve
> read performance?  Or what reference materials I should consult to better
> understand the problem?  Again, I'm totally new to HBase.
> I'm using HBase 0.90.4 and Hadoop 1.2.2. HBase has a heapsize of 1.5 Gigs.
> Here's a Ganglia report covering the 4 hours of " count 'my_table' ":
> http://imgur.com/Aa3eukZ
> Please let me know if I can provide any more information.
> Many thanks,
> Dave
