hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: Help with continuous loading configuration
Date Wed, 16 Nov 2011 23:36:56 GMT
hbase.hstore.blockingStoreFiles is the maximum number of store files HBase will allow before
it will block writes in order to catch up with compacting files. Default is 7.

If this is too low you'll see warnings about blocked writers in the logs. For
one test load I had, I needed to increase this to 20,
along with raising hbase.hregion.memstore.block.multiplier to 4 (this allows the memstore
to grow larger; be careful with this :) ).
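For reference, the two settings mentioned above would go into hbase-site.xml like this (20 and 4 are just the values that happened to work for my test load, not general recommendations):

```
hbase.hstore.blockingStoreFiles=20
hbase.hregion.memstore.block.multiplier=4
```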

hbase.hstore.compactionThreshold is the number of store files needed to trigger a compaction.
Changing this won't help with throughput...

But I'll let somebody else jump in with more operational experience.

From: Amit Jain <jamit0574@gmail.com>
To: user@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com>
Sent: Wednesday, November 16, 2011 3:26 PM
Subject: Re: Help with continuous loading configuration

Hi Lars,

The keys are arriving in random order.  The HBase monitoring page shows
evenly distributed load across all of the region servers.  I didn't see
anything weird in the gc logs, no mention of any failures.  I'm a little
unclear about what the optimal values for the following properties should be.


Is there some rule of thumb that I can use to determine good values for
these properties?

- Amit

On Wed, Nov 16, 2011 at 3:14 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:

> Hi Amit,
> 12MB write buffer might be a bit high.
> How are you generating your keys? You might hot spot a single region
> server if (for example) you create
> monotonically increasing keys. When you look at the HBase monitoring page,
> do you see a single region server
> getting all the requests?
> Anything weird in the GC logs? Do they all log similar?
> -- Lars
> ________________________________
> From: Amit Jain <jamit0574@gmail.com>
> To: user@hbase.apache.org
> Sent: Wednesday, November 16, 2011 3:06 PM
> Subject: Help with continuous loading configuration
> Hello,
> We're doing a proof-of-concept study to see if HBase is a good fit for an
> application we're planning to build.  The application will be recording a
> continuous stream of sensor data throughout the day and the data needs to
> be online immediately.  Our test cluster consists of 16 machines, each with
> 16 cores and 32GB of RAM and 8TB local storage running CDH3u2.  We're using
> the HBase client Put class, and have set the table "auto flush" to false
> and the write buffer size to 12MB.  Here are the region server JVM options:
> export HBASE_REGIONSERVER_OPTS="-Xmx28g -Xms28g -Xmn128m -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -verbose:gc
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
> -Xloggc:$HBASE_HOME/logs/gc-$(hostname)-hbase.log"
> And here are the property settings that we're using in the hbase-site.xml
> file:
> hbase.rootdir=hdfs://master:9000/hbase
> hbase.regionserver.handler.count=20
> hbase.cluster.distributed=true
> hbase.zookeeper.quorum=zk01,zk02,zk03
> hfile.block.cache.size=0
> hbase.hregion.max.filesize=1073741824
> hbase.regionserver.global.memstore.upperLimit=0.79
> hbase.regionserver.global.memstore.lowerLimit=0.70
> hbase.hregion.majorcompaction=0
> hbase.hstore.compactionThreshold=15
> hbase.hstore.blockingStoreFiles=20
> hbase.rpc.timeout=0
> zookeeper.session.timeout=3600000
> It's taking about 24 hours to load 4TB of data which isn't quite fast
> enough for our application.  Is there a more optimal configuration that we
> can use to improve loading performance?
> - Amit
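On the key-distribution point raised earlier in the thread: if keys were monotonically increasing, a common workaround is to prepend a salt derived from the key so that writes spread across regions instead of hot-spotting one server. A minimal sketch in plain Java (the bucket count, prefix format, and class name are made up for illustration; no HBase APIs are involved):

```java
public class SaltedKey {
    // Derive a stable bucket from the key's hash and prepend it,
    // so lexicographically adjacent keys land under different prefixes
    // (and hence, with pre-split regions, on different region servers).
    public static String salt(String key, int buckets) {
        int bucket = (key.hashCode() & 0x7fffffff) % buckets;
        return String.format("%02d-%s", bucket, key);
    }

    public static void main(String[] args) {
        // Sequential timestamp-style keys now scatter across up to 16 prefixes.
        for (long ts = 1000; ts < 1005; ts++) {
            System.out.println(salt("sensor-" + ts, 16));
        }
    }
}
```

The trade-off is that range scans over the original key order now require one scan per salt bucket, so this only makes sense when write throughput matters more than sequential reads.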