hbase-user mailing list archives

From Qiang Tian <tian...@gmail.com>
Subject Re: YCSB load failed because hbase region too busy
Date Tue, 25 Nov 2014 08:50:00 GMT
in your log:
2014-11-25 13:31:35,048 WARN  [MemStoreFlusher.13]
regionserver.MemStoreFlusher: Region
usertable2,user8289,1416889268210.7e8fd83bb34b155bd0385aa63124a875. has too
many store files; delaying flush up to 90000ms

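The flush for this region is being delayed because it already has too many
store files, so its memstore cannot drain and writes stay blocked.

Please see my original reply. You can try increasing
"hbase.hstore.blockingStoreFiles". Also, you have only 1 RS and you pre-split
the table into 100 regions; you can try 2 RS with 20 regions.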
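
As a rough check of the numbers in the exception quoted below: the reported blockingMemStoreSize is consistent with flush size times block multiplier. This is a minimal Ruby sketch, not HBase code. It assumes the stock 128 MiB value for hbase.hregion.memstore.flush.size (the thread does not confirm the configured value) and assumes a region rejects writes once its memstore exceeds that product. The helper names are made up for illustration.

```ruby
# Assumption: stock hbase.hregion.memstore.flush.size of 128 MiB.
FLUSH_SIZE = 128 * 1024 * 1024
# hbase.hregion.memstore.block.multiplier as listed in the thread.
BLOCK_MULTIPLIER = 4

# Hypothetical helper: per-region memstore size above which writes are rejected.
def blocking_memstore_size(flush_size, multiplier)
  flush_size * multiplier
end

# Hypothetical helper: true when a region's memstore is above the blocking size.
def above_memstore_limit?(memstore_size, flush_size, multiplier)
  memstore_size > blocking_memstore_size(flush_size, multiplier)
end

puts blocking_memstore_size(FLUSH_SIZE, BLOCK_MULTIPLIER)
# memstoreSize reported for region user8289 in the exception
puts above_memstore_limit?(536_886_800, FLUSH_SIZE, BLOCK_MULTIPLIER)
```

Under these assumptions the product equals the blockingMemStoreSize=536870912 in the log, and the reported memstoreSize=536886800 is just above it, which is why the region keeps answering with RegionTooBusyException until a flush completes.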



On Tue, Nov 25, 2014 at 3:42 PM, louis.hust <louis.hust@gmail.com> wrote:

> Yes, the stack trace is below:
>
> 2014-11-25 13:35:40:946 4260 sec: 232700856 operations; 28173.18 current
> ops/sec; [INSERT AverageLatency(us)=637.59]
> 2014-11-25 13:35:50:946 4270 sec: 232700856 operations; 0 current ops/sec;
> 14/11/25 13:35:59 INFO client.AsyncProcess: #14, table=usertable2,
> attempt=10/35 failed 109 ops, last exception:
> org.apache.hadoop.hbase.RegionTooBusyException:
> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit,
> regionName=usertable2,user8289,1416889268210.7e8fd83bb34b155bd0385aa63124a875.,
> server=l-hbase10.dba.cn1.qunar.com,60020,1416889404151,
> memstoreSize=536886800, blockingMemStoreSize=536870912
>         at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
>         at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
>         at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
>         at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
>         at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
>         at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
>         at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
>         at java.lang.Thread.run(Thread.java:744)
>
> Then I looked up the memstore size for user8289: it is 512M, and it is still
> 512M now (15:40).
>
> The region server log is attached, which may help.
>
>
>
>
>
>
> On Nov 25, 2014, at 15:27, ramkrishna vasudevan <
> ramkrishna.s.vasudevan@gmail.com> wrote:
>
> > Are you getting any exceptions in the log?  Do you have a stack trace when
> > it is blocked?
> >
> > On Tue, Nov 25, 2014 at 12:30 PM, louis.hust <louis.hust@gmail.com> wrote:
> >
> >> Hi Ram,
> >>
> >> After I modified hbase.hstore.flusher.count, the load improved, but after
> >> one hour the YCSB load program was still blocked! Then I changed
> >> hbase.hstore.flusher.count to 40, but the result was the same as with 20.
> >>
> >> On Nov 25, 2014, at 14:47, ramkrishna vasudevan <
> >> ramkrishna.s.vasudevan@gmail.com> wrote:
> >>
> >>>>> hbase.hstore.flusher.count to 20 (default value is 2), and run the YCSB
> >>>>> to load data with 32 threads
> >>>
> >>> Apologies for the late reply. Your change of configuration from 2 to 20 is
> >>> right in this case because your data ingest rate is high, I suppose.
> >>>
> >>> Thanks for the reply.
> >>>
> >>> Regards
> >>> Ram
> >>>
> >>> On Tue, Nov 25, 2014 at 12:09 PM, louis.hust <louis.hust@gmail.com> wrote:
> >>>
> >>>> Hi all,
> >>>>
> >>>> I reran the YCSB data load, and here is a situation that may explain why
> >>>> the load blocked.
> >>>>
> >>>> I used too many threads to insert values, so the flush threads could not
> >>>> keep up with all the memstores. The user9099 memstore was queued last and
> >>>> waited too long for its flush, which blocked the YCSB requests.
> >>>>
> >>>> Then I modified the configuration, setting hbase.hstore.flusher.count to 20
> >>>> (default value is 2), and ran YCSB to load data with 32 threads. It ran for
> >>>> 1 hour (with the default 2 flusher threads it ran for less than half an
> >>>> hour).
> >>>>
> >>>>
> >>>> On Nov 20, 2014, at 23:20, louis.hust <louis.hust@gmail.com> wrote:
> >>>>
> >>>>> Hi Ram,
> >>>>>
> >>>>> Thanks for your reply!
> >>>>>
> >>>>> I use YCSB workloadc to load data, and from the web request monitor I can
> >>>>> see that the write requests are distributed among all regions, so I think
> >>>>> the data is distributed.
> >>>>>
> >>>>> There are 32 threads writing to the region server, so maybe the
> >>>>> concurrency and write rate are too high. The writes are blocked but the
> >>>>> memstore does not get flushed. I want to know why.
> >>>>>
> >>>>> The JVM heap is 64G, so hbase.regionserver.global.memstore.size at its
> >>>>> default (0.4) is about 25.6G, and hbase.hregion.memstore.flush.size is at
> >>>>> its default (128M), but the blocked memstore user9099 reached 512M and
> >>>>> does not flush at all.
> >>>>>
> >>>>> other memstore related options:
> >>>>>
> >>>>> hbase.hregion.memstore.mslab.enabled=true
> >>>>> hbase.regionserver.global.memstore.upperLimit=0.4
> >>>>> hbase.regionserver.global.memstore.lowerLimit=0.38
> >>>>> hbase.hregion.memstore.block.multiplier=4
> >>>>>
> >>>>>
> >>>>> On Nov 20, 2014, at 20:38, ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com> wrote:
> >>>>>
> >>>>>> Check whether the writes are going to that particular region and whether
> >>>>>> its write rate is too high.  Ensure that the data gets distributed among
> >>>>>> all regions. What is the memstore size?
> >>>>>>
> >>>>>> If the rate of writes is very high, flushes get queued, and writes stay
> >>>>>> blocked until the memstore has been flushed enough to drop below the
> >>>>>> global upper limit.
> >>>>>>
> >>>>>> I don't have the code at hand to check the exact memstore-related config.
> >>>>>>
> >>>>>> Regards
> >>>>>> Ram
> >>>>>>
> >>>>>> On Thu, Nov 20, 2014 at 4:50 PM, louis.hust <louis.hust@gmail.com> wrote:
> >>>>>> Hi all,
> >>>>>>
> >>>>>> I built an HBase test environment on three PC servers, with CDH 5.1.0:
> >>>>>>
> >>>>>> pc1 pc2 pc3
> >>>>>>
> >>>>>> pc1 and pc2 as HMaster and Hadoop NameNode
> >>>>>> pc3 as RegionServer and DataNode
> >>>>>>
> >>>>>> Then I created the table as follows:
> >>>>>> create 'usertable', 'family', {SPLITS => (1..100).map {|i| "user#{1000+i*(9999-1000)/100}"} }
> >>>>>> and used YCSB to load data as follows:
> >>>>>>
> >>>>>> ./bin/ycsb load hbase -P workloads/workloadc -p columnfamily=family -p recordcount=1000000000 -p threadcount=32 -s > result/workloadc
> >>>>>>
> >>>>>>
> >>>>>> But after a while, ycsb returned the following error:
> >>>>>>
> >>>>>> 14/11/20 12:23:44 INFO client.AsyncProcess: #15, table=usertable,
> >>>>>> attempt=35/35 failed 715 ops, last exception:
> >>>>>> org.apache.hadoop.hbase.RegionTooBusyException:
> >>>>>> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit,
> >>>>>> regionName=usertable,user9099,1416453519676.2552d36eb407a8af12d2b58c973d68a9.,
> >>>>>> server=l-hbase10.dba.cn1,60020,1416451280772, memstoreSize=536897120,
> >>>>>> blockingMemStoreSize=536870912
> >>>>>>       at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
> >>>>>>       at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
> >>>>>>       at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
> >>>>>>       at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
> >>>>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
> >>>>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
> >>>>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
> >>>>>>       at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
> >>>>>>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
> >>>>>>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
> >>>>>>       at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
> >>>>>>       at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
> >>>>>>       at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
> >>>>>>       at java.lang.Thread.run(Thread.java:744)
> >>>>>> on l-hbase10.dba.cn1,60020,1416451280772, tracking started Thu Nov 20
> >>>>>> 12:15:07 CST 2014, retrying after 20051 ms, replay 715 ops.
> >>>>>>
> >>>>>>
> >>>>>> It seems the user9099 region is too busy, so I looked up the memstore
> >>>>>> metrics in the web UI:
> >>>>>>
> >>>>>>
> >>>>>> As you can see, the user9099 memstore is bigger than those of the other
> >>>>>> regions. I thought it was flushing, but after a while it had not shrunk,
> >>>>>> and YCSB finally quit.
> >>>>>>
> >>>>>> But when I change the number of concurrent threads to 4, everything works.
> >>>>>> I want to know why.
> >>>>>>
> >>>>>> Any ideas will be appreciated.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>>
> >>
> >>
>
>
>
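
For reference, the SPLITS expression in the original create command can be evaluated outside the HBase shell to see which split keys it produces. This is a small Ruby sketch. The method name split_keys is only for illustration, and the arithmetic relies on integer division exactly as the shell expression does.

```ruby
# Same expression as in the quoted create command, wrapped in a
# hypothetical helper so it can be inspected outside the HBase shell.
def split_keys
  (1..100).map { |i| "user#{1000 + i * (9999 - 1000) / 100}" }
end

keys = split_keys
puts keys.first                 # lowest split key
puts keys.last                  # highest split key
puts keys.length + 1            # N split keys yield N + 1 regions
puts keys.include?("user8289")  # start key of the region in the later log
puts keys.include?("user9099")  # start key of the region in the first report
```

The generated keys include user8289 and user9099, the start keys of the two regions named in the RegionTooBusyException messages, and the 100 split keys yield 101 regions on the single RegionServer.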
