hbase-user mailing list archives

From Rohit Dev <rohitdeve...@gmail.com>
Subject Re: Hbase tuning for heavy write cluster
Date Sun, 26 Jan 2014 11:35:05 GMT
Hi Vladimir,

Here is my cluster status:

Cluster Size: 26
Server memory: 128GB
Total Writes per sec (data): 450 Mbps
Writes per sec (count) per server: avg ~800 writes/sec (some spikes
up to 3000 writes/sec)
Max Region Size: 16GB
Regions per server: ~140 (not sure if I would be able to merge some
empty regions while table is online)
We are running CDH 4.3

Recently I changed the settings to:
 Java heap size for Region Server: 32GB
 hbase.hregion.memstore.flush.size: 536870912
 hbase.hstore.blockingStoreFiles: 30
 hbase.hstore.compaction.max: 15
 hbase.hregion.memstore.block.multiplier: 3
 hbase.regionserver.maxlogs: 90 (is this too high for a 512MB memstore flush size?)
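
One rough way to reason about the maxlogs question is to compare the total amount of edits the WALs can hold with the amount of heap the memstores are allowed to use. The sketch below is only an illustration under assumptions that are not stated in this thread: an HLog block size of 128 MB, a roll multiplier of 0.95, and a global memstore lower limit of 0.35 of the heap. The function names are hypothetical helpers, not HBase APIs.

```python
# Rough WAL (HLog) capacity check for the maxlogs question above.
# ASSUMPTIONS (not from the thread): 128 MB HLog block size, 0.95 roll
# multiplier, and a global memstore lower limit of 0.35 of the heap.
# Check the real values in hbase-site.xml / hdfs-site.xml before relying on this.

GIB = 1024 ** 3
MIB = 1024 ** 2


def wal_capacity_bytes(maxlogs, hlog_block_bytes=128 * MIB, roll_multiplier=0.95):
    """Approximate bytes of edits the WALs can hold before flushes are forced."""
    return round(maxlogs * hlog_block_bytes * roll_multiplier)


def memstore_lower_limit_bytes(heap_bytes, lower_limit=0.35):
    """Approximate heap share the memstores may occupy before global flushing."""
    return round(heap_bytes * lower_limit)


heap = 32 * GIB  # region server heap from the settings above
budget = memstore_lower_limit_bytes(heap)

# 32 is the maxlogs value shown in the "Too many hlogs" log line quoted below;
# 90 is the new setting.
for maxlogs in (32, 90):
    wal = wal_capacity_bytes(maxlogs)
    print(f"maxlogs={maxlogs}: WAL capacity ~{wal / GIB:.1f} GiB, "
          f"memstore lower limit ~{budget / GIB:.1f} GiB")
```

If the WAL capacity is far below the memstore budget, flushes tend to be forced by "Too many hlogs" before memstores reach their flush size; if it is far above, recovery after a crash has more log to replay. Under the assumed defaults, 90 logs come out slightly below the memstore lower limit for a 32GB heap, so by this rule of thumb the value does not look obviously too high, but the result depends entirely on the real block size and limits in use.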

I'm also seeing some odd behavior: one region has grown to 34GB and has
21 store files, even though MAX_FILESIZE for this table is only 16GB.
Could this be a problem?
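
The memstore settings above can be checked against the "Blocking updates" log lines quoted further down: the per-region blocking size reported there equals the flush size times the block multiplier. The sketch below works through that arithmetic. The 0.4 global memstore upper limit is an assumption (it is not mentioned in the thread), and the helper names are hypothetical.

```python
# Memstore arithmetic for the settings discussed in this thread.
# ASSUMPTION (not from the thread): the global memstore upper limit is
# 0.4 of the region server heap.

GIB = 1024 ** 3
MIB = 1024 ** 2


def blocking_threshold_bytes(flush_size_bytes, block_multiplier):
    """Per-region memstore size at which updates are blocked."""
    return flush_size_bytes * block_multiplier


def regions_at_full_flush_size(heap_bytes, flush_size_bytes, upper_limit=0.4):
    """How many regions can hold a full flush-size memstore at the same time."""
    return int((heap_bytes * upper_limit) // flush_size_bytes)


# Settings from the first message: 128 MB flush size, multiplier 2.
old = blocking_threshold_bytes(128 * MIB, 2)
# New settings: flush size 536870912 bytes, multiplier 3.
new = blocking_threshold_bytes(536870912, 3)

print(f"old blocking threshold: {old / MIB:.0f} MiB")
print(f"new blocking threshold: {new / GIB:.1f} GiB")
print("regions that can fill a 512 MB memstore at once on a 32 GB heap:",
      regions_at_full_flush_size(32 * GIB, 536870912))
```

With roughly 140 regions per server, only a fraction of them can reach the 512MB flush size at the same time under the assumed global limit, so many flushes may still be triggered by global memory pressure or by the WAL limit rather than by the per-region flush size.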


On Sat, Jan 25, 2014 at 9:49 PM, Vladimir Rodionov
<vrodionov@carrieriq.com> wrote:
> What is the load (ingestion) rate per server in your cluster?
>
> Best regards,
> Vladimir Rodionov
> Principal Platform Engineer
> Carrier IQ, www.carrieriq.com
> e-mail: vrodionov@carrieriq.com
>
> ________________________________________
> From: Rohit Dev [rohitdevel14@gmail.com]
> Sent: Saturday, January 25, 2014 6:09 PM
> To: user@hbase.apache.org
> Subject: Re: Hbase tuning for heavy write cluster
>
> The compaction queue is ~600 on one of the region servers, while it is
> less than 5 on the others (26 nodes total).
> Compaction queue started going up after I increased the settings[1].
> In general, one Major compaction takes about 18 Mins.
>
> In the same region-server I'm seeing these two log messages frequently:
>
> 2014-01-25 17:56:27,312 INFO
> org.apache.hadoop.hbase.regionserver.wal.HLog: Too many hlogs:
> logs=167, maxlogs=32; forcing flush of 1 regions(s):
> 3788648752d1c53c1ec80fad72d3e1cc
>
> 2014-01-25 17:57:48,733 INFO
> org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for
> 'IPC Server handler 53 on 60020' on region
> tsdb,\x008WR\xE2+\x90\x00\x00\x02Qu\xF1\x00\x00(\x00\x97A\x00\x008M(7\x00\x00Bl\xE85,1390623438462.e6692a1f23b84494015d111954bf00db.:
> memstore size 1.5 G is >= than blocking 1.5 G size
>
> Any suggestions on what else I can do, or is it OK to ignore these messages?
>
>
> [1]
> New settings are:
>  - hbase.hregion.memstore.flush.size - 536870912
>  - hbase.hstore.blockingStoreFiles - 30
>  - hbase.hstore.compaction.max - 15
>  - hbase.hregion.memstore.block.multiplier - 3
>
> On Sat, Jan 25, 2014 at 3:00 AM, Ted Yu <yuzhihong@gmail.com> wrote:
>> Yes, it is normal.
>>
>> On Jan 25, 2014, at 2:12 AM, Rohit Dev <rohitdevel14@gmail.com> wrote:
>>
>>> I changed these settings:
>>> - hbase.hregion.memstore.flush.size - 536870912
>>> - hbase.hstore.blockingStoreFiles - 30
>>> - hbase.hstore.compaction.max - 15
>>> - hbase.hregion.memstore.block.multiplier - 3
>>>
>>> Things seem to be getting better now; I'm not seeing any of those
>>> annoying 'Blocking updates' messages. However, I'm seeing an
>>> increase in 'Compaction Queue' size on some servers.
>>>
>>> I noticed memstores are getting flushed, but some with 'compaction
>>> requested=true'[1]. Is this normal ?
>>>
>>>
>>> [1]
>>> INFO org.apache.hadoop.hbase.regionserver.HRegion: Finished memstore
>>> flush of ~512.0 M/536921056, currentsize=3.0 M/3194800 for region
>>> tsdb,\x008ZR\xE1t\xC0\x00\x00\x02\x01\xB0\xF9\x00\x00(\x00\x0B]\x00\x008M((\x00\x00Bk\x9F\x0B,1390598160292.7fb65e5fd5c4cfe08121e85b7354bae9.
>>> in 3422ms, sequenceid=18522872289, compaction requested=true
>>>
>>> On Fri, Jan 24, 2014 at 6:51 PM, Bryan Beaudreault
>>> <bbeaudreault@hubspot.com> wrote:
>>>> Also, I think you can up the hbase.hstore.blockingStoreFiles quite a bit
>>>> higher.  You could try something like 50.  It will reduce read performance
>>>> a bit, but shouldn't be too bad especially for something like opentsdb I
>>>> think.  If you are going to up the blockingStoreFiles you're probably also
>>>> going to want to up hbase.hstore.compaction.max.
>>>>
>>>> For my tsdb cluster, which is 8 i2.4xlarges in EC2, we have 90 regions for
>>>> tsdb.  We were also having issues with blocking, and I upped
>>>> blockingStoreFiles to 35, compaction.max to 15, and
>>>> memstore.block.multiplier to 3.  We haven't had problems since.  Memstore
>>>> flushsize for the tsdb table is 512MB.
>>>>
>>>> Finally, 64GB heap may prove problematic, but it's worth a shot.  I'd
>>>> definitely recommend java7 with the G1 garbage collector though.  In
>>>> general, Java would have a hard time with heap sizes greater than 20-25GB
>>>> without some careful tuning.
>>>>
>>>>
>>>> On Fri, Jan 24, 2014 at 9:44 PM, Bryan Beaudreault
>>>> <bbeaudreault@hubspot.com> wrote:
>>>>
>>>>> It seems from your ingestion rate you are still blowing through HFiles
>>>>> too fast.  You're going to want to up the MEMSTORE_FLUSHSIZE for the
>>>>> table from the default of 128MB.  If opentsdb is the only thing on this
>>>>> cluster, you can do the math pretty easily to find the maximum
>>>>> allowable, based on your heap size and accounting for 40% (default)
>>>>> used for the block cache.
>>>>>
>>>>>
>>>>> On Fri, Jan 24, 2014 at 9:38 PM, Rohit Dev <rohitdevel14@gmail.com> wrote:
>>>>>
>>>>>> Hi Kevin,
>>>>>>
>>>>>> We have about 160 regions per server with a 16GB region size and 10
>>>>>> drives for HBase. I've looked at disk IO and that doesn't seem to be
>>>>>> a problem (% utilization is < 2 across all disks).
>>>>>>
>>>>>> Any suggestion on what heap size I should allocate? Normally I allocate
>>>>>> 16GB.
>>>>>>
>>>>>> Also, I read that increasing hbase.hstore.blockingStoreFiles and
>>>>>> hbase.hregion.memstore.block.multiplier is a good idea for a write-heavy
>>>>>> cluster, but in my case it seems to be heading in the wrong direction.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> On Fri, Jan 24, 2014 at 6:31 PM, Kevin O'dell <kevin.odell@cloudera.com>
>>>>>> wrote:
>>>>>>> Rohit,
>>>>>>>
>>>>>>>  64GB heap is not ideal, you will run into some weird issues. How many
>>>>>>> regions are you running per server, how many drives in each node, any
>>>>>>> other settings you changed from default?
>>>>>>> On Jan 24, 2014 6:22 PM, "Rohit Dev" <rohitdevel14@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> We are running OpenTSDB on a CDH 4.3 HBase cluster, with most of the
>>>>>>>> default settings. The cluster is write-heavy and I'm trying to see
>>>>>>>> what parameters I can tune to optimize write performance.
>>>>>>>>
>>>>>>>>
>>>>>>>> # I get messages related to Memstore[1] and Slow Response[2] very
>>>>>>>> often. Is this an indication of an issue?
>>>>>>>>
>>>>>>>> I tried increasing some parameters on one node:
>>>>>>>> - hbase.hstore.blockingStoreFiles - from default 7 to 15
>>>>>>>> - hbase.hregion.memstore.block.multiplier - from default 2 to 8
>>>>>>>> - and heap size from 16GB to 64GB
>>>>>>>>
>>>>>>>> * 'Compaction queue' went up to ~200 within 60 mins after restarting
>>>>>>>> the region server with the new parameters, and the log started to get
>>>>>>>> even more noisy.
>>>>>>>>
>>>>>>>> Can anyone please suggest whether I'm going in the right direction with
>>>>>>>> these new settings, or if there are other things that I could monitor
>>>>>>>> or change to make it better?
>>>>>>>>
>>>>>>>> Thank you!
>>>>>>>>
>>>>>>>>
>>>>>>>> [1]
>>>>>>>> INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates
>>>>>>>> for 'IPC Server handler 19 on 60020' on region
>>>>>>>> tsdb,\x008XR\xE0i\x90\x00\x00\x02Q\x7F\x1D\x00\x00(\x00\x0B]\x00\x008M(r\x00\x00Bl\xA7\x8C,1390556781703.0771bf90cab25c503d3400206417f6bf.:
>>>>>>>> memstore size 256.3 M is >= than blocking 256 M size
>>>>>>>>
>>>>>>>> [2]
>>>>>>>> WARN org.apache.hadoop.ipc.HBaseServer: (responseTooSlow):
>>>>>>>> {"processingtimems":17887,"call":"multi(org.apache.hadoop.hbase.client.MultiAction@586940ea),
>>>>>>>> rpc version=1, client version=29,
>>>>>>>> methodsFingerPrint=0","client":"192.168.10.10:54132","starttimems":1390587959182,"queuetimems":1498,"class":"HRegionServer","responsesize":0,"method":"multi"}
>>>>>
>>>>>
>
