Mailing-List: contact user-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hbase.apache.org
Received-SPF: pass (athena.apache.org: domain of
 ramkrishna.s.vasudevan@gmail.com designates 209.85.223.172 as permitted
 sender)
MIME-Version: 1.0
In-Reply-To: <7AD9FC8F-0159-45C0-A7C6-13D0A1454936@gmail.com>
References: <7AD9FC8F-0159-45C0-A7C6-13D0A1454936@gmail.com>
Date: Thu, 20 Nov 2014 18:08:05 +0530
Message-ID: 
 <CAAT7Mkpu7QX9y8czjc2t_g2dZyFs3rZsWaKvXfcvB2O8b==deg@mail.gmail.com>
Subject: Re: YCSB load failed because hbase region too busy
From: ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com>
To: "user@hbase.apache.org" <user@hbase.apache.org>
Content-Type: multipart/related; boundary=90e6ba6e89b2976c610508499959

--90e6ba6e89b2976c610508499959
Content-Type: multipart/alternative; boundary=90e6ba6e89b2976c5d0508499958

--90e6ba6e89b2976c5d0508499958
Content-Type: text/plain; charset=UTF-8

Check whether the writes are all going to that one region and whether its
write rate is too high.  Ensure that the data is distributed across all regions.
What is the memstore size?
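
One way to check the distribution is to reproduce the split points from the
create statement quoted below and see which region a given row key would land
in. This is a minimal sketch in plain Ruby (the split expression is copied
from the quoted post). The helper region_for and the sample row keys are
hypothetical and are not part of HBase or YCSB.

```ruby
# Split points exactly as written in the quoted `create` statement.
SPLITS = (1..100).map { |i| "user#{1000 + i * (9999 - 1000) / 100}" }

# Hypothetical helper: return the start key of the region that would hold
# row_key, assuming regions are bounded by SPLITS and that keys compare as
# plain strings (HBase orders row keys lexicographically by bytes).
def region_for(row_key, splits = SPLITS)
  start = splits.reverse.find { |s| row_key >= s }
  start || ""   # "" stands for the first region, which has no start key
end

puts SPLITS.first                      # user1089
puts SPLITS[89]                        # user9099 (the region named in the error)
puts SPLITS.size + 1                   # 101 regions in total
puts region_for("user9100123456789")   # user9099
```

Running a sample of the real row keys through a check like this shows
whether they spread over all 101 regions or pile up in a few of them.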

If the write rate is very high, flushes get queued, and writes stay blocked
until the memstore has been flushed back down below the upper limit.
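
As a rough illustration, the exception quoted below reports a per-region
limit: HBase derives a region's blocking size from
hbase.hregion.memstore.flush.size multiplied by
hbase.hregion.memstore.block.multiplier, and rejects writes to the region
while its memstore is above that size. The sketch below uses the two byte
counts from the quoted log. The 128 MB flush size and the multiplier of 4 are
assumptions, picked only because their product matches the logged limit; they
are not values read from this cluster.

```ruby
# Values copied from the quoted RegionTooBusyException message.
memstore_size          = 536_897_120
blocking_memstore_size = 536_870_912

# Assumed settings (not confirmed for this cluster): one combination
# whose product equals the logged blocking size.
flush_size = 128 * 1024 * 1024   # hbase.hregion.memstore.flush.size
multiplier = 4                   # hbase.hregion.memstore.block.multiplier

computed_limit = flush_size * multiplier
over_by        = memstore_size - blocking_memstore_size

puts computed_limit == blocking_memstore_size   # true
puts blocking_memstore_size / (1024 * 1024)     # 512 (MB)
puts over_by                                    # 26208 bytes over the limit
puts memstore_size > blocking_memstore_size     # true, so the write is rejected
```

The region is only about 26 KB over a 512 MB limit, which fits the picture of
flushes not keeping up with the incoming writes.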

I don't have the code at hand right now to check the exact memstore-related
config.

Regards
Ram

On Thu, Nov 20, 2014 at 4:50 PM, louis.hust <louis.hust@gmail.com> wrote:

> hi all,
>
> I built an HBase test environment with three PC servers, running CDH 5.1.0
>
> pc1 pc2 pc3
>
> pc1 and pc2 as HMaster and Hadoop NameNode
> pc3 as RegionServer and DataNode
>
> Then I created the table as follows:
>
> create 'usertable', 'family', {SPLITS => (1..100).map {|i| "user#{1000+i*(9999-1000)/100}"} }
>
> I loaded data with YCSB as follows:
>
> ./bin/ycsb  load  hbase   -P workloads/workloadc  -p columnfamily=family
> -p recordcount=1000000000   -p threadcount=32  -s  > result/workloadc
>
>
> But after a while, YCSB returned the following error:
>
> 14/11/20 12:23:44 INFO client.AsyncProcess: #15, table=usertable,
> attempt=35/35 failed 715 ops, last exception:
> org.apache.hadoop.hbase.RegionTooBusyException:
> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit,
> regionName=usertable,user9099,1416453519676.2552d36eb407a8af12d2b58c973d68a9.,
> server=l-hbase10.dba.cn1,60020,1416451280772,
> memstoreSize=536897120, blockingMemStoreSize=536870912
>         at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
>         at java.lang.Thread.run(Thread.java:744)
>  on l-hbase10.dba.cn1,60020,1416451280772,
> tracking started Thu Nov 20 12:15:07 CST 2014, retrying after 20051 ms,
> replay 715 ops.
>
>
> It seems the user9099 region is too busy, so I looked up the memstore
> metrics in the web UI:
>
> As you can see, the memstore of the user9099 region is bigger than those
> of the other regions. I assumed it was flushing, but after a while it had
> not shrunk, and YCSB finally quit.
>
> But when I change the number of concurrent threads to 4, everything works.
> I want to know why.
>
> Any ideas will be appreciated.
>
>
>

--90e6ba6e89b2976c5d0508499958--

--90e6ba6e89b2976c610508499959--