hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: Help with continuous loading configuration
Date Wed, 16 Nov 2011 23:14:08 GMT
Hi Amit,

12MB write buffer might be a bit high.

How are you generating your keys? You might hot spot a single region server if (for example)
you create monotonically increasing keys. When you look at the HBase monitoring page, do you
see a single region server getting all the requests?
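
If the keys do turn out to be monotonically increasing (timestamps, for example), one common way to spread the writes is to prefix each key with a small "salt" derived from the key itself. The following is only a minimal sketch of that idea, not something taken from your setup: the class name SaltedKey, the bucket count of 16 (one per machine, as an assumption), and the use of Arrays.hashCode as the hash are all illustrative choices.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SaltedKey {
    // Assumed bucket count for illustration: one bucket per region server.
    public static final int BUCKETS = 16;

    /** Returns the key prefixed with a one-byte bucket id derived from a hash of the key. */
    public static byte[] salt(byte[] key) {
        // Mask the sign bit so the modulo result is never negative.
        int bucket = (Arrays.hashCode(key) & 0x7fffffff) % BUCKETS;
        byte[] salted = new byte[key.length + 1];
        salted[0] = (byte) bucket;
        System.arraycopy(key, 0, salted, 1, key.length);
        return salted;
    }

    public static void main(String[] args) {
        // Simulate monotonically increasing keys and count how many land in each bucket.
        int[] counts = new int[BUCKETS];
        long start = 1321485248000L; // arbitrary starting timestamp
        for (long ts = start; ts < start + 10000; ts++) {
            byte[] key = Long.toString(ts).getBytes(StandardCharsets.UTF_8);
            counts[salt(key)[0]]++;
        }
        System.out.println(Arrays.toString(counts));
    }
}
```

Because the salt is computed from the key, a reader can recompute the prefix for a point lookup. The trade-off is that a range scan over the original key order has to be issued once per bucket and merged on the client. The salt only helps if the table has regions covering the different prefixes, so it is usually combined with pre-splitting the table on the bucket boundaries.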


Anything weird in the GC logs? Do all the region servers show similar GC activity?


-- Lars



________________________________
From: Amit Jain <jamit0574@gmail.com>
To: user@hbase.apache.org
Sent: Wednesday, November 16, 2011 3:06 PM
Subject: Help with continuous loading configuration

Hello,

We're doing a proof-of-concept study to see if HBase is a good fit for an
application we're planning to build.  The application will be recording a
continuous stream of sensor data throughout the day and the data needs to
be online immediately.  Our test cluster consists of 16 machines, each with
16 cores and 32GB of RAM and 8TB local storage running CDH3u2.  We're using
the HBase client Put class, and have set the table "auto flush" to false
and the write buffer size to 12MB.  Here are the region server JVM options:

export HBASE_REGIONSERVER_OPTS="-Xmx28g -Xms28g -Xmn128m -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -verbose:gc
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
-Xloggc:$HBASE_HOME/logs/gc-$(hostname)-hbase.log"

And here are the property settings that we're using in the hbase-site.xml
file:

hbase.rootdir=hdfs://master:9000/hbase
hbase.regionserver.handler.count=20
hbase.cluster.distributed=true
hbase.zookeeper.quorum=zk01,zk02,zk03
hfile.block.cache.size=0
hbase.hregion.max.filesize=1073741824
hbase.regionserver.global.memstore.upperLimit=0.79
hbase.regionserver.global.memstore.lowerLimit=0.70
hbase.hregion.majorcompaction=0
hbase.hstore.compactionThreshold=15
hbase.hstore.blockingStoreFiles=20
hbase.rpc.timeout=0
zookeeper.session.timeout=3600000

It's taking about 24 hours to load 4TB of data which isn't quite fast
enough for our application.  Is there a more optimal configuration that we
can use to improve loading performance?

- Amit