From: lars hofhansl
Reply-To: user@hbase.apache.org
To: user@hbase.apache.org
Date: Wed, 16 Nov 2011 15:36:56 -0800 (PST)
Subject: Re: Help with continuous loading configuration

hbase.hstore.blockingStoreFiles is the maximum number of store files HBase will allow before it will block writes in order to catch up with compacting files.
Default is 7.

If this is too low you'll see warnings about blocking writers in the logs. I found that for some test load I had, I needed to increase this to 20, along with changing hbase.hregion.memstore.block.multiplier to 4 (this allows the memstore to grow larger; be careful with this :) ).

hbase.hstore.compactionThreshold is the number of store files that will trigger a compaction. Changing this won't help with throughput...

But I'll let somebody else jump in with more operational experience.

________________________________
From: Amit Jain
To: user@hbase.apache.org; lars hofhansl
Sent: Wednesday, November 16, 2011 3:26 PM
Subject: Re: Help with continuous loading configuration

Hi Lars,

The keys are arriving in random order. The HBase monitoring page shows evenly distributed load across all of the region servers. I didn't see anything weird in the GC logs, no mention of any failures. I'm a little unclear about what the optimal values for the following properties should be:

hbase.hstore.compactionThreshold
hbase.hstore.blockingStoreFiles

Is there some rule of thumb that I can use to determine good values for these properties?

- Amit

On Wed, Nov 16, 2011 at 3:14 PM, lars hofhansl wrote:

> Hi Amit,
>
> 12MB write buffer might be a bit high.
>
> How are you generating your keys? You might hot-spot a single region
> server if (for example) you create monotonically increasing keys. When
> you look at the HBase monitoring page, do you see a single region server
> getting all the requests?
>
> Anything weird in the GC logs? Do they all log similar?
>
> -- Lars
>
> ________________________________
> From: Amit Jain
> To: user@hbase.apache.org
> Sent: Wednesday, November 16, 2011 3:06 PM
> Subject: Help with continuous loading configuration
>
> Hello,
>
> We're doing a proof-of-concept study to see if HBase is a good fit for an
> application we're planning to build. The application will be recording a
> continuous stream of sensor data throughout the day, and the data needs to
> be online immediately. Our test cluster consists of 16 machines, each with
> 16 cores, 32GB of RAM, and 8TB of local storage, running CDH3u2. We're using
> the HBase client Put class, and have set the table "auto flush" to false
> and the write buffer size to 12MB. Here are the region server JVM options:
>
> export HBASE_REGIONSERVER_OPTS="-Xmx28g -Xms28g -Xmn128m -XX:+UseParNewGC
> -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -verbose:gc
> -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
> -Xloggc:$HBASE_HOME/logs/gc-$(hostname)-hbase.log"
>
> And here are the property settings that we're using in the hbase-site.xml
> file:
>
> hbase.rootdir=hdfs://master:9000/hbase
> hbase.regionserver.handler.count=20
> hbase.cluster.distributed=true
> hbase.zookeeper.quorum=zk01,zk02,zk03
> hfile.block.cache.size=0
> hbase.hregion.max.filesize=1073741824
> hbase.regionserver.global.memstore.upperLimit=0.79
> hbase.regionserver.global.memstore.lowerLimit=0.70
> hbase.hregion.majorcompaction=0
> hbase.hstore.compactionThreshold=15
> hbase.hstore.blockingStoreFiles=20
> hbase.rpc.timeout=0
> zookeeper.session.timeout=3600000
>
> It's taking about 24 hours to load 4TB of data, which isn't quite fast
> enough for our application. Is there a more optimal configuration that we
> can use to improve loading performance?
>
> - Amit
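[Editor's note: expressed as an hbase-site.xml fragment, the store-file tuning Lars describes would look roughly like the following. The values (20 and 4) are the ones he mentions for his own test load, not general recommendations; treat them as starting points.]

```xml
<!-- Sketch of the tuning discussed in this thread; values are from
     Lars's test load and should be tuned for your own workload. -->
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <!-- Default is 7; writes block once a store has this many files. -->
  <value>20</value>
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <!-- Allows the memstore to grow larger before blocking; use with care. -->
  <value>4</value>
</property>
```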
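[Editor's note: the client-side write path Amit describes (auto flush off, 12MB write buffer, the Put class) can be sketched with the HBase 0.90-era client API that shipped in CDH3. The table name, column family, qualifier, and row key below are placeholders, since the thread does not name them, and method names may differ in later HBase releases.]

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class SensorLoader {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // "sensor" is a placeholder; the thread does not name the table.
        HTable table = new HTable(conf, "sensor");
        table.setAutoFlush(false);                   // batch Puts client-side
        table.setWriteBufferSize(12L * 1024 * 1024); // 12MB, as in the original mail

        Put put = new Put(Bytes.toBytes("row-key"));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
        table.put(put);       // buffered until the write buffer fills

        table.flushCommits(); // push any remaining buffered Puts
        table.close();
    }
}
```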