Date: Thu, 12 Sep 2013 09:52:44 -0700
Subject: Re: High cpu usage on a region server
From: OpenSource Dev
To: user@hbase.apache.org, lars hofhansl

Thanks Lars. Are there any other workarounds for this issue until we get the fix? If not, we might have to apply the patch and roll out a custom package.

On Thu, Sep 12, 2013 at 8:36 AM, lars hofhansl wrote:
> Yep... Very likely HBASE-9428:
>
> 8 threads:
>    java.lang.Thread.State: RUNNABLE
>         at java.util.Arrays.copyOf(Arrays.java:2786)
>         at java.lang.StringCoding.decode(StringCoding.java:178)
>         at java.lang.String.<init>(String.java:483)
>         at org.apache.hadoop.hbase.filter.RegexStringComparator.compareTo(RegexStringComparator.java:96)
>         ...
>
> 4 threads:
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.cs.ISO_8859_1$Decoder.decodeArrayLoop(ISO_8859_1.java:79)
>         at sun.nio.cs.ISO_8859_1$Decoder.decodeLoop(ISO_8859_1.java:106)
>         at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:544)
>         at java.lang.StringCoding$StringDecoder.decode(StringCoding.java:140)
>         at java.lang.StringCoding.decode(StringCoding.java:179)
>         at java.lang.String.<init>(String.java:483)
>         at org.apache.hadoop.hbase.filter.RegexStringComparator.compareTo(RegexStringComparator.java:96)
>
> It's also consistent with what you see: Lots of garbage (hence tweaking your GC options had a significant effect)
> The fix is in 0.94.12, which is in RC right now, probably to be released early next week.
>
> -- Lars
>
> ________________________________
> From: OpenSource Dev
> To: user@hbase.apache.org
> Sent: Thursday, September 12, 2013 8:15 AM
> Subject: Re: High cpu usage on a region server
>
> A server started getting busy last night, but this time it took ~5 hrs
> to get from 15% busy to 75% busy. It is not running 80% flat-out yet.
> But this is still very high compared to other servers that are running
> under ~25% CPU usage. The only change I made yesterday was the
> addition of "-XX:+UseParNewGC" to the HBase startup command.
>
> http://pastebin.com/VRmujgyH
>
> On Wed, Sep 11, 2013 at 2:28 PM, Stack wrote:
>> Can you thread dump the busy server and pastebin it?
>> Thanks,
>> St.Ack
>>
>> On Wed, Sep 11, 2013 at 1:49 PM, OpenSource Dev wrote:
>>
>>> Hi,
>>>
>>> I'm using HBase 0.94.6 (CDH 4.3) for OpenTSDB. So far I have had no
>>> issues with writes/puts. The system handles up to 800k puts per second
>>> without issue. On average we do 250k puts per second.
>>>
>>> I am having a problem with reads. I've isolated where the problem is
>>> but have not been able to find the root cause.
>>>
>>> I have 16 machines running hbase-region server, each with ~35 regions.
>>> Once in a while CPU goes flat-out at 80% on one region server. These are
>>> the things I've noticed in Ganglia:
>>>
>>> hbase.regionserver.request - evenly distributed. Not seeing any spikes on the busy server
>>> hbase.regionserver.blockCacheSize - between 500MB and 1000MB
>>> hbase.regionserver.compactionQueueSize - avg 2 or less
>>> hbase.regionserver.blockCacheHitRatio - 30% on busy node, >60% on other nodes
>>>
>>> JVM heap size is set to 16GB and I'm using -XX:+UseParNewGC
>>> -XX:+UseConcMarkSweepGC
>>>
>>> I've noticed the system load moves to a different region, sometimes
>>> within a minute, if the busy region is restarted.
>>>
>>> Any suggestions on what could be causing the load and/or what other
>>> metrics I should check?
>>>
>>> Thank you!
>>>
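
The grouping in Lars's reply ("8 threads", "4 threads" sharing the same RUNNABLE stack) can be reproduced mechanically when a dump is too long to read by eye. Below is a minimal sketch, assuming jstack-style output in which thread entries are separated by blank lines. The class name ThreadDumpTally, the tally method, and the SAMPLE dump with its com.example frames are all made up for illustration; nothing here comes from HBase or from the pastebin above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: groups RUNNABLE threads in a jstack-style dump by their top frames. */
public class ThreadDumpTally {

    /** Made-up sample dump (not real output): two RUNNABLE threads with the same stack, one WAITING. */
    static final String SAMPLE =
          "\"worker-1\" prio=10 runnable\n"
        + "   java.lang.Thread.State: RUNNABLE\n"
        + "\tat com.example.Codec.decode(Codec.java:10)\n"
        + "\tat com.example.Worker.run(Worker.java:20)\n"
        + "\n"
        + "\"worker-2\" prio=10 runnable\n"
        + "   java.lang.Thread.State: RUNNABLE\n"
        + "\tat com.example.Codec.decode(Codec.java:10)\n"
        + "\tat com.example.Worker.run(Worker.java:20)\n"
        + "\n"
        + "\"idle-1\" prio=10 waiting\n"
        + "   java.lang.Thread.State: WAITING (parking)\n"
        + "\tat com.example.Pool.take(Pool.java:5)\n";

    /**
     * Counts RUNNABLE threads per distinct prefix of up to `depth` stack frames.
     * Assumes thread entries are separated by blank lines.
     */
    public static Map<String, Integer> tally(String dump, int depth) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String block : dump.split("\\n\\s*\\n")) {
            if (!block.contains("java.lang.Thread.State: RUNNABLE")) {
                continue; // skip WAITING/BLOCKED threads and non-thread sections
            }
            List<String> frames = new ArrayList<>();
            for (String line : block.split("\\n")) {
                String trimmed = line.trim();
                // Only "at ..." lines are frames; lock annotations ("- locked ...") are ignored.
                if (trimmed.startsWith("at ") && frames.size() < depth) {
                    frames.add(trimmed);
                }
            }
            if (!frames.isEmpty()) {
                counts.merge(String.join("\n", frames), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // With a file argument, read a saved dump; otherwise fall back to the made-up sample.
        String dump = args.length > 0
                ? new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8)
                : SAMPLE;
        tally(dump, 4).forEach((stack, n) ->
                System.out.println(n + " threads:\n" + stack + "\n"));
    }
}
```

After compiling, passing the path of a saved dump as the only argument prints one group per distinct stack prefix. A stack that keeps many RUNNABLE threads across successive dumps is a reasonable place to start looking, which is how the RegexStringComparator frames stood out in this thread.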