hbase-dev mailing list archives

From mail list <louis.hust...@gmail.com>
Subject Re: YCSB load data quit because hbase region too busy
Date Tue, 25 Nov 2014 02:22:39 GMT
Hi all,

I reran the YCSB data load, and here is a scenario that may explain why the load blocked.

I use too many threads to insert values, so the flush thread cannot keep up with
all the memstores. The user9099 memstore ends up last in the flush queue and waits
so long to be flushed that it blocks the YCSB requests.

Is that possible?
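
As a quick sanity check on the numbers in the exception quoted below, here is a minimal Python sketch. The two sizes are copied from the RegionTooBusyException message. The flush size (128 MB) and the block multiplier (4) are assumptions taken from the "128M*4" remark in the quoted reply, not values read from my configuration file; in HBase they would correspond to hbase.hregion.memstore.flush.size and hbase.hregion.memstore.block.multiplier. The helper names are only for illustration.

```python
# Values copied from the RegionTooBusyException quoted in this thread.
MEMSTORE_SIZE = 536_897_120           # memstoreSize
BLOCKING_MEMSTORE_SIZE = 536_870_912  # blockingMemStoreSize

# Assumed settings, inferred from the "128M*4" remark in the quoted reply.
# They are not read from the actual hbase-site.xml.
FLUSH_SIZE = 128 * 1024 * 1024  # assumed hbase.hregion.memstore.flush.size
BLOCK_MULTIPLIER = 4            # assumed hbase.hregion.memstore.block.multiplier


def blocking_size(flush_size: int, multiplier: int) -> int:
    """Memstore size at which a region starts rejecting writes."""
    return flush_size * multiplier


def overshoot(memstore_size: int, limit: int) -> int:
    """Bytes by which the memstore exceeds the blocking limit."""
    return memstore_size - limit


if __name__ == "__main__":
    limit = blocking_size(FLUSH_SIZE, BLOCK_MULTIPLIER)
    print("limit matches exception:", limit == BLOCKING_MEMSTORE_SIZE)
    print("overshoot in bytes:", overshoot(MEMSTORE_SIZE, limit))
```

Under these assumptions the computed limit equals the blockingMemStoreSize in the exception, and the region is only slightly above it. That fits the idea that writes are rejected as soon as the limit is crossed and stay rejected until a flush completes.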


On Nov 21, 2014, at 13:33, Qiang Tian <tianq01@gmail.com> wrote:

> ---pc3 as RegionServer and datanode
> You only have 1 RS (split into 100 regions), and HDFS replicates to 3? Perhaps
> compaction cannot keep up with the flush rate, so the number of store files hits
> "hbase.hstore.blockingStoreFiles"
> and flushes are blocked for a while (90s by default). During this period, the
> memstore continues to grow and finally reaches the blocking size (128M*4) ...
> 
> Turning on debug logging will show more messages. PS: "hbase.hregion.majorcompaction"=0
> cannot turn off major compaction.
> 
> 
> 
> 
> On Fri, Nov 21, 2014 at 12:55 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> 
>> Louis:
>> See this thread:
>> http://search-hadoop.com/m/DHED4XrURi2
>> 
>> On Thu, Nov 20, 2014 at 7:33 PM, mail list <louis.hust.ml@gmail.com>
>> wrote:
>> 
>>> If I set the target, YCSB will sleep to throttle the load, so it behaves
>>> much like using fewer threads.
>>> But I want to know why, under a heavy write load, a region that exceeds the
>>> limit is not flushed.
>>> Maybe, as you said, it is waiting in a flush queue. I will try to reproduce
>>> the scenario and look at the flush queue.
>>> 
>>> On Nov 21, 2014, at 1:44, Vladimir Rodionov <vladrodionov@gmail.com>
>>> wrote:
>>> 
>>>> Could you please rerun your tests with -target ?
>>>> You should limit # of transactions per second and find the maximum your
>>>> cluster can sustain.
>>>> 
>>>> -Vladimir Rodionov
>>>> 
>>>> On Wed, Nov 19, 2014 at 10:53 PM, mail list <louis.hust.ml@gmail.com>
>>> wrote:
>>>> 
>>>>> hi all,
>>>>> 
>>>>> I built an HBase test environment with three PC servers, running
>>>>> CDH 5.1.0.
>>>>> 
>>>>> pc1 pc2 pc3
>>>>> 
>>>>> pc1 and pc2 as HMASTER and hadoop namenode
>>>>> pc3 as RegionServer and datanode
>>>>> 
>>>>> Then I created the table as follows:
>>>>> 
>>>>> create 'usertable', 'family', {SPLITS => (1..100).map {|i| "user#{1000+i*(9999-1000)/100}"} }
>>>>> 
>>>>> 
>>>>> I used YCSB to load data as follows:
>>>>> 
>>>>> ./bin/ycsb load hbase -P workloads/workloadc -p columnfamily=family -p recordcount=1000000000 -p threadcount=32 -s > result/workloadc
>>>>> 
>>>>> 
>>>>> But after a while, YCSB returned the following error:
>>>>> 
>>>>> 14/11/20 12:23:44 INFO client.AsyncProcess: #15, table=usertable,
>>>>> attempt=35/35 failed 715 ops, last exception:
>>>>> org.apache.hadoop.hbase.RegionTooBusyException:
>>>>> org.apache.hadoop.hbase.RegionTooBusyException: Above memstore limit,
>>>>> regionName=usertable,user9099,1416453519676.2552d36eb407a8af12d2b58c973d68a9.,
>>>>> server=l-hbase10.dba.cn1.qunar.com,60020,1416451280772,
>>>>> memstoreSize=536897120, blockingMemStoreSize=536870912
>>>>>       at org.apache.hadoop.hbase.regionserver.HRegion.checkResources(HRegion.java:2822)
>>>>>       at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2234)
>>>>>       at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2201)
>>>>>       at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:2205)
>>>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.doBatchOp(HRegionServer.java:4253)
>>>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.doNonAtomicRegionMutation(HRegionServer.java:3469)
>>>>>       at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(HRegionServer.java:3359)
>>>>>       at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:29503)
>>>>>       at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2012)
>>>>>       at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
>>>>>       at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.consumerLoop(SimpleRpcScheduler.java:160)
>>>>>       at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler.access$000(SimpleRpcScheduler.java:38)
>>>>>       at org.apache.hadoop.hbase.ipc.SimpleRpcScheduler$1.run(SimpleRpcScheduler.java:110)
>>>>>       at java.lang.Thread.run(Thread.java:744)
>>>>> on l-hbase10.dba.cn1.qunar.com,60020,1416451280772, tracking started
>>>>> Thu Nov 20 12:15:07 CST 2014, retrying after 20051 ms, replay 715 ops.
>>>>> 
>>>>> 
>>>>> It seems the user9099 region is too busy, so I looked up the memstore
>>>>> metrics in the web UI:
>>>>> 
>>>>> 
>>>>> 
>>>>> As you can see, the user9099 memstore is bigger than those of the other
>>>>> regions. I thought it was flushing, but after a while it had not shrunk,
>>>>> and YCSB finally quit.
>>>>> 
>>>>> The region server is configured as below:
>>>>> 
>>>>> Summary: HP DL380p Gen8, 1 x Xeon E5-2630 v2 2.60GHz, 126GB / 128GB
>>>>> 1600MHz DDR3
>>>>> System: HP ProLiant DL380p Gen8
>>>>> Processors: 1 (of 2) x Xeon E5-2630 v2 2.60GHz 100MHz FSB (HT enabled, 6
>>>>> cores, 24 threads)
>>>>> Memory: 126GB / 128GB 1600MHz DDR3 == 16 x 8GB, 8 x empty
>>>>> Disk: sda (scsi2): 450GB (1%) JBOD == 1 x HP-LOGICAL-VOLUME
>>>>> Disk: sdb (scsi2): 27TB (0%) JBOD == 1 x HP-LOGICAL-VOLUME
>>>>> Disk-Control: ata_piix0: Intel C600/X79 series chipset 4-Port SATA IDE
>>>>> Controller
>>>>> Disk-Control: hpsa0: Hewlett-Packard Company Smart Array Gen8 Controllers
>>>>> Disk-Control: shannon0: Device 1cb0:0275
>>>>> Network: eth0 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe,
>>>>> 40:a8:f0:23:55:fc, 1000Mb/s <full-duplex>
>>>>> Network: eth1 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe,
>>>>> 40:a8:f0:23:55:fd, no carrier
>>>>> Network: eth2 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe,
>>>>> 40:a8:f0:23:55:fe, no carrier
>>>>> Network: eth3 (tg3): Broadcom NetXtreme BCM5719 Gigabit PCIe,
>>>>> 40:a8:f0:23:55:ff, no carrier
>>>>> OS: CentOS 6.4 (Final), Linux 2.6.32-358.23.2.el6.x86_64 x86_64, 64-bit
>>>>> BIOS: HP P70 02/10/2014
>>>>> Hostname: l-hbase10.dba.cn1.qunar.com
>>>>> 
>>>>> I have also attached the HBase configuration file.
>>>>> 
>>>>> 
>>>>> I am new to HBase; any ideas would be appreciated!
>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 
>> 

