hbase-user mailing list archives

From OpenSource Dev <dev.opensou...@gmail.com>
Subject Re: High cpu usage on a region server
Date Sun, 15 Sep 2013 06:21:47 GMT
We patched HBase 0.94.6 with HBASE-9428, and the difference is like
night and day.
Read latency has been very consistent, and we haven't seen any CPU load
issues in the last 24+ hours.

Thank you all for helping us resolve this issue.

Bikrant

On Thu, Sep 12, 2013 at 10:25 AM, lars hofhansl <larsh@apache.org> wrote:
> Not that I am aware of. Reducing the HFile block size will lessen this problem (but then
> cause other issues).
>
> It's just a fix to the RegexStringComparator. You can just recompile that and deploy it to
> the RegionServers (you need to make sure it's in the class path before the HBase jars).
> It's probably easier to roll a new release. It's a shame we did not see this earlier.
>
>
> -- Lars
>
>
>
> ________________________________
> From: OpenSource Dev <dev.opensource@gmail.com>
> To: user@hbase.apache.org; lars hofhansl <larsh@apache.org>
> Sent: Thursday, September 12, 2013 9:52 AM
> Subject: Re: High cpu usage on a region server
>
>
> Thanks Lars.
>
> Are there any other workarounds for this issue until we get the fix?
> If not, we might have to apply the patch and roll out a custom package.
>
> On Thu, Sep 12, 2013 at 8:36 AM, lars hofhansl <larsh@apache.org> wrote:
>> Yep... Very likely HBASE-9428:
>>
>> 8 threads:
>>    java.lang.Thread.State: RUNNABLE
>>         at java.util.Arrays.copyOf(Arrays.java:2786)
>>         at java.lang.StringCoding.decode(StringCoding.java:178)
>>         at java.lang.String.<init>(String.java:483)
>>         at org.apache.hadoop.hbase.filter.RegexStringComparator.compareTo(RegexStringComparator.java:96)
>>         ...
>>
>> 4 threads:
>>    java.lang.Thread.State: RUNNABLE
>>         at sun.nio.cs.ISO_8859_1$Decoder.decodeArrayLoop(ISO_8859_1.java:79)
>>         at sun.nio.cs.ISO_8859_1$Decoder.decodeLoop(ISO_8859_1.java:106)
>>         at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:544)
>>         at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:140)
>>         at java.lang.StringCoding.decode(StringCoding.java:179)
>>         at java.lang.String.<init>(String.java:483)
>>         at org.apache.hadoop.hbase.filter.RegexStringComparator.compareTo(RegexStringComparator.java:96)
>>
>> It's also consistent with what you see: lots of garbage (hence tweaking your GC options
>> had a significant effect).
>> The fix is in 0.94.12, which is in RC right now, probably to be released early next
>> week.
>>
>> -- Lars
>>
>>
>>
>> ________________________________
>> From: OpenSource Dev <dev.opensource@gmail.com>
>> To: user@hbase.apache.org
>> Sent: Thursday, September 12, 2013 8:15 AM
>> Subject: Re: High cpu usage on a region server
>>
>>
>> A server started getting busy last night, but this time it took ~5 hrs
>> to get from 15% busy to 75% busy. It is not running 80% flat-out yet.
>> But this is still very high compared to other servers that are running
>> under ~25% CPU usage. The only change I made yesterday was the
>> addition of "-XX:+UseParNewGC" to the HBase startup command.
>>
>> http://pastebin.com/VRmujgyH
>>
>> On Wed, Sep 11, 2013 at 2:28 PM, Stack <stack@duboce.net> wrote:
>>> Can you thread dump the busy server and pastebin it?
>>> Thanks,
>>> St.Ack
>>>
>>>
>>> On Wed, Sep 11, 2013 at 1:49 PM, OpenSource Dev <dev.opensource@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm using HBase 0.94.6 (CDH 4.3) for OpenTSDB. So far I have had no
>>>> issues with writes/puts. The system handles up to 800k puts per second
>>>> without issue. On average we do 250k puts per second.
>>>>
>>>> I am having a problem with reads. I've isolated where the
>>>> problem is but have not been able to find the root cause.
>>>>
>>>> I have 16 machines running the HBase region server, each with ~35 regions.
>>>> Once in a while the CPU goes flat-out at 80% on one region server. These are
>>>> the things I've noticed in Ganglia:
>>>>
>>>> hbase.regionserver.request - evenly distributed. Not seeing any spikes
>>>> on the busy server
>>>> hbase.regionserver.blockCacheSize - between 500MB and 1000MB
>>>> hbase.regionserver.compactionQueueSize - avg 2 or less
>>>> hbase.regionserver.blockCacheHitRatio - 30% on busy node, >60% on other
>>>> nodes
>>>>
>>>>
>>>> JVM Heap size is set to 16GB and I'm using -XX:+UseParNewGC
>>>> -XX:+UseConcMarkSweepGC
>>>>
>>>> I've noticed the system load moves to a different region, sometimes
>>>> within a minute, if the busy region is restarted.
>>>>
>>>> Any suggestions on what could be causing the load and/or what other
>>>> metrics I should check?
>>>>
>>>>
>>>> Thank you!
>>>>
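
Editorial sketch (not part of the original thread): the stack traces quoted above show threads sitting in Arrays.copyOf beneath String.<init>, called from RegexStringComparator.compareTo, and Lars notes that a smaller HFile block size would lessen the problem. One plausible reading, which is an assumption here and is not spelled out in the thread, is that on that JVM decoding a short slice with an explicit charset first copied the whole backing byte array, so each comparison produced garbage proportional to the block size and not to the value. The Java below only illustrates that idea. The class name, method names, sample bytes, and the 64 KB block size are all hypothetical, and this is neither the HBase source nor the actual HBASE-9428 patch.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical illustration only; names and sizes are assumptions.
class SliceDecodeSketch {

    // Decodes the slice straight out of the large backing array.
    // On some older JVMs this path may copy all of `block` before decoding
    // (assumption based on the Arrays.copyOf frames in the thread dump).
    static boolean findDirect(Pattern pattern, byte[] block, int offset,
                              int length, Charset charset) {
        return pattern.matcher(new String(block, offset, length, charset)).find();
    }

    // Copies only the bytes of interest first, then decodes the small array.
    // Any defensive copy inside String is then bounded by `length`.
    static boolean findViaSliceCopy(Pattern pattern, byte[] block, int offset,
                                    int length, Charset charset) {
        byte[] slice = Arrays.copyOfRange(block, offset, offset + length);
        return pattern.matcher(new String(slice, charset)).find();
    }

    public static void main(String[] args) {
        // Stand-in for one block; 64 KB is an assumed size for illustration.
        byte[] block = new byte[64 * 1024];
        byte[] value = "sys.cpu.user".getBytes(StandardCharsets.ISO_8859_1);
        int offset = 1000;
        System.arraycopy(value, 0, block, offset, value.length);

        Pattern pattern = Pattern.compile("cpu\\.(user|system)");
        boolean direct = findDirect(pattern, block, offset, value.length,
                                    StandardCharsets.ISO_8859_1);
        boolean sliced = findViaSliceCopy(pattern, block, offset, value.length,
                                          StandardCharsets.ISO_8859_1);
        // Both approaches should agree on the match result.
        System.out.println(direct + " " + sliced);
    }
}
```

Both methods return the same answer; the difference, under the assumption above, is only in how many bytes are copied per call. On current JVMs the two paths may cost about the same, so treat this as an explanation of the symptom in the thread and not as tuning advice.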
