hbase-user mailing list archives

From "louis.hust" <louis.h...@gmail.com>
Subject Re: YCSB load failed because hbase region too busy
Date Tue, 25 Nov 2014 07:00:09 GMT
Hi Ram,

After I modified hbase.hstore.flusher.count, the load improved, but after one
hour the YCSB load program was blocked again! Then I raised
hbase.hstore.flusher.count to 40, but the result is the same as with 20.
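For reference, a change like this would normally go into hbase-site.xml on each region server and typically requires a region server restart to take effect (a sketch, assuming the standard HBase configuration file layout; the property name and values are the ones discussed in this thread):

```xml
<!-- hbase-site.xml: number of concurrent memstore flush threads (default is 2) -->
<property>
  <name>hbase.hstore.flusher.count</name>
  <value>20</value>
</property>
```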

On Nov 25, 2014, at 14:47, ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com> wrote:

>>> hbase.hstore.flusher.count to 20 (default value is 2), and run the YCSB
> to load data
> with 32 threads
> 
> Apologies for the late reply. Your change of configuration from 2 to 20 is
> right in this case, because your data ingest rate is high, I suppose.
> 
> Thanks for the reply.
> 
> Regards
> Ram
> 
> On Tue, Nov 25, 2014 at 12:09 PM, louis.hust <louis.hust@gmail.com> wrote:
> 
>> Hi all,
>> 
>> I retested the YCSB data load, and here is a situation which may explain
>> why the load was blocked.
>> 
>> I used too many threads to insert values, so the flush threads could not
>> effectively handle all the memstores; the user9099 memstore was queued
>> last and waited too long for its flush, which blocked the YCSB requests.
>> 
>> Then I modified the configuration, setting hbase.hstore.flusher.count to 20
>> (the default value is 2), and ran the YCSB load with 32 threads. It ran for
>> 1 hour (with 2 flushers it ran for less than half an hour).
>> 
>> 
>> On Nov 20, 2014, at 23:20, louis.hust <louis.hust@gmail.com> wrote:
>> 
>>> Hi Ram,
>>> 
>>> Thanks for your reply!
>>> 
>>> I use YCSB workloadc to load data, and from the web request monitor I can
>>> see that the write requests are distributed among all regions, so I think
>>> the data does get distributed.
>>> 
>>> There are 32 threads writing to the region server, so maybe the
>>> concurrency and write rate are too high. The writes are blocked but the
>>> memstore does not get flushed; I want to know why.
>>> 
>>> The JVM heap is 64G and hbase.regionserver.global.memstore.size is the
>>> default (0.4), about 25.6G, and hbase.hregion.memstore.flush.size is the
>>> default (128M), but the blocked memstore of region user9099 reaches 512M
>>> and does not flush at all.
>>> 
>>> other memstore related options:
>>> 
>>> hbase.hregion.memstore.mslab.enabled=true
>>> hbase.regionserver.global.memstore.upperLimit=0.4
>>> hbase.regionserver.global.memstore.lowerLimit=0.38
>>> hbase.hregion.memstore.block.multiplier=4
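The numbers in this thread are consistent with each other: assuming the standard 128M default for hbase.hregion.memstore.flush.size (134217728 bytes) and the block.multiplier of 4 listed above, the per-region blocking threshold works out to exactly the blockingMemStoreSize reported in the RegionTooBusyException below. A quick check:

```shell
# Per-region write-blocking threshold:
#   hbase.hregion.memstore.flush.size * hbase.hregion.memstore.block.multiplier
flush_size=134217728   # 128M, the default flush size (assumption: default in use)
multiplier=4           # hbase.hregion.memstore.block.multiplier from this thread
echo $(( flush_size * multiplier ))   # 536870912 = the blockingMemStoreSize in the error
```

Once a region's memstore crosses this 512M threshold, writes to it are rejected until a flush brings it back down, which matches the blocking behavior described here.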
>>> 
>>> 
>>> On Nov 20, 2014, at 20:38, ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com> wrote:
>>> 
>>>> Check whether the writes are going to that particular region and its
>>>> write rate is too high. Ensure that the data gets distributed among all
>>>> regions. What is the memstore size?
>>>> 
>>>> If the rate of writes is very high then the flushes will get queued, and
>>>> writes will be blocked until the memstores are flushed down below the
>>>> global upper limit.
>>>> 
>>>> I don't have the code now to see the exact config related to memstore.
>>>> 
>>>> Regards
>>>> Ram
>>>> 
>>>> On Thu, Nov 20, 2014 at 4:50 PM, louis.hust <louis.hust@gmail.com> wrote:
>>>> Hi all,
>>>> 
>>>> I built an HBase test environment on three PC servers, with CDH 5.1.0:
>>>> 
>>>> pc1 pc2 pc3
>>>> 
>>>> pc1 and pc2 as HMaster and Hadoop namenode
>>>> pc3 as RegionServer and datanode
>>>> 
>>>> Then I created the user table as follows:
>>>> 
>>>> create 'usertable', 'family', {SPLITS => (1..100).map {|i| "user#{1000+i*(9999-1000)/100}"} }
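The SPLITS expression uses Ruby integer arithmetic, so the 100 boundary keys run from user1089 up to user9999; notably, user9099 (the region named in the RegionTooBusyException below) is the i=90 boundary. The same arithmetic can be checked in shell, whose integer division truncates the same way for positive values (the sample i values here are just for illustration):

```shell
# Split key for boundary i: "user" + (1000 + i*(9999-1000)/100), integer division
for i in 1 50 90 100; do
  echo "user$(( 1000 + i * (9999 - 1000) / 100 ))"
done
# prints user1089, user5499, user9099, user9999
```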
>>>> Then I used YCSB to load data as follows:
>>>> 
>>>> ./bin/ycsb load hbase -P workloads/workloadc -p columnfamily=family -p recordcount=1000000000 -p threadcount=32 -s > result/workloadc
>>>> 
>>>> 
>>>> But after a while, YCSB returned the following error:
>>>> 
>>>> 14/11/20 12:23:44 INFO client.AsyncProcess: #15, table=usertable, attempt=35/35 failed 715 ops, last exception: org.apache.hadoop.hbase.RegionTooBusyException: org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit, regionName=usertable,user9099,1416453519676.2552d36eb407a8af12d2b58c973d68a9., server=l-hbase10.dba.cn1,60020,1416451280772, memstoreSize=536897120, blockingMemStoreSize=536870912
>>>>        at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
>>>>        at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
>>>>        at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
>>>>        at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
>>>>        at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
>>>>        at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
>>>>        at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
>>>>        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
>>>>        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
>>>>        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
>>>>        at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
>>>>        at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
>>>>        at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
>>>>        at java.lang.Thread.run(Thread.java:744)
>>>> on l-hbase10.dba.cn1,60020,1416451280772, tracking started Thu Nov 20 12:15:07 CST 2014, retrying after 20051 ms, replay 715 ops.
>>>> 
>>>> 
>>>> It seems the user9099 region is too busy, so I looked up the memstore
>>>> metrics in the web UI:
>>>> 
>>>> 
>>>> As you can see, the user9099 memstore is bigger than the other regions'.
>>>> I thought it was flushing, but after a while it did not shrink to a
>>>> smaller size, and YCSB finally quit.
>>>> 
>>>> But when I change the number of concurrent threads to 4, everything
>>>> works. I want to know why.
>>>> 
>>>> Any idea will be appreciated.
>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 

