hbase-user mailing list archives

From Ricardo Vilaça <rmvil...@di.uminho.pt>
Subject HBase Tuning
Date Wed, 10 Oct 2012 12:51:45 GMT

I'm running some experiments with HBase 0.92 and Hadoop 1.0.1.
We have a small cluster of dual-core machines with 8 GB of RAM each.
The cluster has 1 node running the NameNode and the HBase master,
1 node running ZooKeeper, and 20 nodes each running a RegionServer
co-located with a DataNode.

The application has 8 tables, all of them partitioned into regions,
for a total of 164 regions in the cluster. The regions are evenly
distributed across the RegionServers, about 8 regions per RegionServer,
with less than 1.5 GB of data.

We have already tuned the HDFS and HBase configuration, the main
parameters being:
    * hbase.regionserver.handler.count=100
    * hfile.block.cache.size=0.5
    * hbase.regionserver.global.memstore.upperLimit=0.15
    * hbase.regionserver.global.memstore.lowerLimit=0.1
    * hbase.hregion.memstore.mslab.enabled=true
    * HBASE_REGIONSERVER_OPTS="-Xmx6546m -Xms4046m -Xmn128m
-XX:+UseParNewGC -XX:+UseConcMarkSweepGC"

    * dfs.replication=3
    * dfs.datanode.max.xcievers=16384
    * dfs.datanode.handler.count=4
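
The memory fractions above are relative to the RegionServer heap, so it can help to turn them into absolute sizes. The following is only a rough sketch of that arithmetic: it assumes the fractions apply to the maximum heap (-Xmx), that "8 GB" means 8192 MB, and the function and key names are made up for illustration.

```python
# Heap budget implied by the settings listed above (values from the post).
HEAP_MAX_MB = 6546            # -Xmx6546m
BLOCK_CACHE_FRACTION = 0.5    # hfile.block.cache.size
MEMSTORE_UPPER = 0.15         # hbase.regionserver.global.memstore.upperLimit
MEMSTORE_LOWER = 0.1          # hbase.regionserver.global.memstore.lowerLimit
NODE_RAM_MB = 8 * 1024        # assumption: 8 GB of RAM taken as 8192 MB


def heap_budget(heap_mb):
    """Return the approximate size in MB of each memory area at full heap."""
    block_cache = heap_mb * BLOCK_CACHE_FRACTION
    memstore_upper = heap_mb * MEMSTORE_UPPER
    memstore_lower = heap_mb * MEMSTORE_LOWER
    return {
        "block_cache_mb": block_cache,
        "memstore_upper_mb": memstore_upper,
        "memstore_lower_mb": memstore_lower,
        # Heap left for everything else once cache and memstores are full.
        "other_heap_mb": heap_mb - block_cache - memstore_upper,
        # RAM left for the DataNode, the OS and the page cache.
        "ram_outside_heap_mb": NODE_RAM_MB - heap_mb,
    }


budget = heap_budget(HEAP_MAX_MB)
for name, value in budget.items():
    print(f"{name}: {value:.1f}")
```

Under these assumptions the block cache alone can grow to roughly 3.2 GB per RegionServer, more than the data set described above, which would be consistent with a block cache hit ratio close to 100%. About 1.6 GB of RAM is left outside the RegionServer heap for the co-located DataNode and the operating system.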

The clients run on quad-core machines, also with 8 GB of RAM. The
application has several clients, each running in its own thread.
A single client node is able to handle 400 clients with linear
throughput and acceptable latency (below 1.5 seconds)
for application operations, each of which involves several HBase
operations. In detail, the mix of HBase operations
per second is as follows:
    * 480 scans with an average size of 45
    * 88 single row puts
    * 51 batch gets with an average size of 100
    * 55 deletes
    * 830 single row gets
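
To put the mix above in perspective, the per-second figures can be added up into a total call rate and an approximate row rate. This is a small sketch only: reading "average size" as "rows per call" is an assumption, and the dictionary keys are illustrative names rather than HBase identifiers.

```python
# Per-second operation mix from the post: (calls per second, rows per call).
# Treating "average size" as "rows per call" is an assumption.
OPERATION_MIX = {
    "scan": (480, 45),
    "single_row_put": (88, 1),
    "batch_get": (51, 100),
    "delete": (55, 1),
    "single_row_get": (830, 1),
}
WRITE_OPERATIONS = {"single_row_put", "delete"}


def totals(mix):
    """Return (calls per second, rows touched per second) for the mix."""
    calls = sum(rate for rate, _ in mix.values())
    rows = sum(rate * size for rate, size in mix.values())
    return calls, rows


calls_per_sec, rows_per_sec = totals(OPERATION_MIX)
write_rows_per_sec = sum(
    rate * size
    for name, (rate, size) in OPERATION_MIX.items()
    if name in WRITE_OPERATIONS
)
read_fraction = 1 - write_rows_per_sec / rows_per_sec

print(f"calls/s: {calls_per_sec}, rows/s: {rows_per_sec}")
print(f"read share of rows: {read_fraction:.1%}")
```

Under that reading the workload is dominated by reads (over 99% of the rows touched), with scans accounting for most of the rows.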

With this configuration the RegionServers have no I/O wait, the
blockCacheHitRatio is almost 100%, hdfsBlocksLocalityIndex is 100,
network usage is low, and the CPU on all RegionServers is more than
90% idle.

However, when I add a second client node, also with 400 clients,
the latency triples,
yet the RegionServers remain more than 80% idle. I have tried different
values for hbase.regionserver.handler.count and also
for hbase.client.ipc.pool.size and hbase.client.ipc.pool.type, but
without any improvement.
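
The observation above can be turned into a rough throughput estimate with Little's law. The sketch below assumes closed-loop clients (each thread issues its next operation as soon as the previous one completes, with no think time) and takes the latency with two client nodes to be three times the single-node latency; neither assumption is confirmed by the measurements, and the function name is only illustrative.

```python
def relative_throughput(client_factor, latency_factor):
    """Throughput change for closed-loop clients with no think time.

    By Little's law, throughput = concurrent_clients / latency, so scaling
    the number of clients by client_factor and the latency by
    latency_factor scales throughput by client_factor / latency_factor.
    """
    return client_factor / latency_factor


# 400 -> 800 clients (factor 2), latency becomes 3x the previous value.
ratio = relative_throughput(client_factor=2, latency_factor=3)
print(f"aggregate throughput changes by a factor of {ratio:.2f}")
```

If these assumptions hold, doubling the clients did not just fail to add throughput: the aggregate rate dropped to about two thirds of the single-node rate, which points to contention somewhere in the request path rather than to a lack of RegionServer CPU.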

Is there any configuration parameter that can improve latency with
many concurrent threads and more than one HBase client node?
Also, which JMX metrics should I monitor on the RegionServers to find out
what may be causing this, and how can I achieve better CPU utilization
on the RegionServers?


Ricardo Vilaça

High-Assurance Software Lab
INESC TEC & Universidade do Minho
