hbase-user mailing list archives

From Bharath Ravi <bharathra...@gmail.com>
Subject Improving HBase read performance (based on YCSB)
Date Tue, 14 Feb 2012 04:43:59 GMT
Hi all,

I have a distributed HBase setup, on which I'm running the YCSB benchmark
(https://github.com/brianfrankcooper/YCSB/wiki/running-a-workload).
There are 5 region servers, each a dual-core machine with around 4 GB of
memory, connected by a 1 Gbps Ethernet switch.

The number of "handlers" per regionserver is set to 500 (!) and HDFS's
maximum receivers per datanode is 4096.

The benchmark dataset is large enough not to fit in memory.
Update/Insert/Write throughput goes up to 8000 ops/sec easily.
However, I see read latencies on the order of seconds, and read throughput
of only a few hundred ops per second.

"top" tells me that the CPUs on the regionservers spend 70-80% of their time
waiting for IO, while disk and network have plenty of unused bandwidth.
How could I diagnose where the read bottleneck is?
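
One way to cross-check the iowait figure independently of top is to sample the aggregate "cpu" line of /proc/stat twice and compare the counters. The following is only a minimal sketch, assuming a Linux regionserver and the standard field order (user, nice, system, idle, iowait, irq, softirq, steal); the function names are hypothetical and not part of HBase or YCSB.

```python
def parse_cpu_line(line):
    """Parse an aggregate 'cpu' line from /proc/stat into integer tick counts."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    return [int(x) for x in fields[1:]]


def iowait_percent(before, after):
    """Return the share of CPU time (in percent) spent in iowait between two samples.

    Assumes the Linux field order: user nice system idle iowait irq softirq steal.
    Only the first eight fields are summed, because guest time is already
    included in user time on kernels that report it.
    """
    b = parse_cpu_line(before)[:8]
    a = parse_cpu_line(after)[:8]
    deltas = [y - x for x, y in zip(b, a)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (the aggregate 'cpu' line). Linux only."""
    with open(path) as f:
        return f.readline()
```

To use it on a live machine, call read_cpu_line() once, wait a few seconds, call it again, and pass both lines to iowait_percent(). A high value alongside low disk throughput would be consistent with many small random reads rather than a bandwidth limit, though this sketch by itself cannot confirm that.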

Any help would be greatly appreciated :)

Thanks in advance!
-- 
Bharath Ravi
