hbase-user mailing list archives

From Amit Jain <jamit0...@gmail.com>
Subject Re: Help with continuous loading configuration
Date Wed, 16 Nov 2011 23:26:11 GMT
Hi Lars,

The keys are arriving in random order.  The HBase monitoring page shows
evenly distributed load across all of the region servers.  I didn't see
anything weird in the gc logs, no mention of any failures.  I'm a little
unclear about what the optimal values for the following properties should
be:

hbase.hstore.compactionThreshold
hbase.hstore.blockingStoreFiles

Is there some rule of thumb that I can use to determine good values for
these properties?

- Amit
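
The hot-spotting question in the quoted message below can be illustrated with a small sketch. It is a simplified model, not HBase code: row keys are plain integers (real HBase keys are byte arrays), and the key space, the 16 evenly pre-split regions, and the helper names `region_for` and `load_per_region` are all assumptions made for illustration. It shows why randomly ordered keys spread across regions while monotonically increasing keys all land in the last one.

```python
import bisect
import random
from collections import Counter

def region_for(key, split_points):
    """Return the index of the region whose sorted key range contains key."""
    return bisect.bisect_right(split_points, key)

def load_per_region(keys, split_points):
    """Count how many writes each region receives."""
    counts = Counter(region_for(k, split_points) for k in keys)
    return [counts.get(i, 0) for i in range(len(split_points) + 1)]

# Hypothetical table: integer key space pre-split into 16 equal regions.
KEY_SPACE = 1_000_000
REGIONS = 16
split_points = [KEY_SPACE * i // REGIONS for i in range(1, REGIONS)]

# Keys arriving in random order (seeded so the run is repeatable).
rng = random.Random(0)
random_keys = [rng.randrange(KEY_SPACE) for _ in range(10_000)]

# Monotonically increasing keys: the newest batch sits at the top of the key space.
sequential_keys = list(range(KEY_SPACE - 10_000, KEY_SPACE))

random_load = load_per_region(random_keys, split_points)
sequential_load = load_per_region(sequential_keys, split_points)

print("random keys per region:    ", random_load)
print("sequential keys per region:", sequential_load)
```

In this model every sequential key falls into the final region, which matches the "single region server getting all the requests" symptom, whereas the random keys touch all 16 regions.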

On Wed, Nov 16, 2011 at 3:14 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:

> Hi Amit,
>
> 12MB write buffer might be a bit high.
>
> How are you generating your keys? You might hot-spot a single region
> server if, for example, you create monotonically increasing keys. When
> you look at the HBase monitoring page, do you see a single region server
> getting all the requests?
>
>
> Anything weird in the GC logs? Do they all look similar?
>
>
> -- Lars
>
>
>
> ________________________________
> From: Amit Jain <jamit0574@gmail.com>
> To: user@hbase.apache.org
> Sent: Wednesday, November 16, 2011 3:06 PM
> Subject: Help with continuous loading configuration
>
> Hello,
>
> We're doing a proof-of-concept study to see if HBase is a good fit for an
> application we're planning to build.  The application will be recording a
> continuous stream of sensor data throughout the day and the data needs to
> be online immediately.  Our test cluster consists of 16 machines, each with
> 16 cores and 32GB of RAM and 8TB local storage running CDH3u2.  We're using
> the HBase client Put class, and have set the table "auto flush" to false
> and the write buffer size to 12MB.  Here are the region server JVM options:
>
> export HBASE_REGIONSERVER_OPTS="-Xmx28g -Xms28g -Xmn128m -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -verbose:gc
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
> -Xloggc:$HBASE_HOME/logs/gc-$(hostname)-hbase.log"
>
> And here are the property settings that we're using in the hbase-site.xml
> file:
>
> hbase.rootdir=hdfs://master:9000/hbase
> hbase.regionserver.handler.count=20
> hbase.cluster.distributed=true
> hbase.zookeeper.quorum=zk01,zk02,zk03
> hfile.block.cache.size=0
> hbase.hregion.max.filesize=1073741824
> hbase.regionserver.global.memstore.upperLimit=0.79
> hbase.regionserver.global.memstore.lowerLimit=0.70
> hbase.hregion.majorcompaction=0
> hbase.hstore.compactionThreshold=15
> hbase.hstore.blockingStoreFiles=20
> hbase.rpc.timeout=0
> zookeeper.session.timeout=3600000
>
> It's taking about 24 hours to load 4TB of data, which isn't quite fast
> enough for our application.  Is there a better configuration that we
> can use to improve loading performance?
>
> - Amit
>
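
For reference, the figures quoted in the original message can be turned into rough rates with a few lines of arithmetic. This is a back-of-envelope sketch under stated assumptions: decimal terabytes (1 TB = 10^12 bytes), all 16 machines running a region server with writes spread evenly, and the rough server-side memory estimate of write buffer size times handler count that the HBase configuration documentation suggests for `hbase.client.write.buffer`.

```python
# Figures from the original message.
DATA_BYTES = 4 * 10**12                 # 4 TB, assuming decimal terabytes
LOAD_SECONDS = 24 * 60 * 60             # about 24 hours
REGION_SERVERS = 16                     # assumes every machine runs a region server
WRITE_BUFFER_BYTES = 12 * 1024 * 1024   # 12MB client write buffer
HANDLER_COUNT = 20                      # hbase.regionserver.handler.count

# Sustained ingest rate across the cluster and per region server.
aggregate_mb_per_s = DATA_BYTES / LOAD_SECONDS / 10**6
per_server_mb_per_s = aggregate_mb_per_s / REGION_SERVERS

# Rough server-side memory held by in-flight write buffers:
# one buffer per RPC handler when all handlers are busy.
buffer_mb_per_server = WRITE_BUFFER_BYTES * HANDLER_COUNT / 1024**2

print(f"aggregate ingest:  {aggregate_mb_per_s:.1f} MB/s")
print(f"per region server: {per_server_mb_per_s:.2f} MB/s")
print(f"write buffers:     {buffer_mb_per_server:.0f} MB per region server")
```

Under these assumptions the cluster is sustaining roughly 46 MB/s in total, or just under 3 MB/s per region server, and a 12MB write buffer with 20 handlers can pin on the order of 240 MB per region server.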
