Date: Mon, 8 Sep 2014 12:53:45 +0530
Subject: Re: HBase - Performance issue
From: kiran
To: lars hofhansl
Cc: "user@hbase.apache.org"

Hi Lars,

Ours is a problem of I/O wait and network bandwidth increasing around the same time.

Sorry to say this, Lars, but ours is a production cluster and we would ideally never want any downtime. Also, we had a very miserable experience upgrading from 0.92 to 0.94: the change in split policy was never mentioned in the release notes, the new policy was not ideal for our cluster, and it took us at least a week to figure that out.

Our cluster runs on commodity hardware with big regions (5-10 GB). Region server memory is 10 GB, the disks are 2 TB SATA (5400-7200 rpm), and the internal network bandwidth is 1 Gbit.

So please suggest any workaround that works with 0.94.1.
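For what it's worth, the split-policy change we hit is 0.94 switching the default from ConstantSizeRegionSplitPolicy to IncreasingToUpperBoundRegionSplitPolicy. The old behaviour can be pinned back either cluster-wide via hbase.regionserver.region.split.policy in hbase-site.xml, or per table from the Java client, roughly as in the sketch below. The table name "mytable" is made up, and I'm assuming the SPLIT_POLICY table attribute is honoured by our 0.94.x build, so treat this as a sketch rather than something tested:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy;
import org.apache.hadoop.hbase.util.Bytes;

public class PinSplitPolicy {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    byte[] tableName = Bytes.toBytes("mytable");  // made-up table name

    // The table must be disabled before its descriptor can be modified.
    admin.disableTable(tableName);
    HTableDescriptor htd = admin.getTableDescriptor(tableName);

    // Pin the pre-0.94 behaviour for this table only: regions split purely
    // on size (hbase.hregion.max.filesize), not on the region-count heuristic.
    htd.setValue(HTableDescriptor.SPLIT_POLICY,
        ConstantSizeRegionSplitPolicy.class.getName());

    admin.modifyTable(tableName, htd);
    admin.enableTable(tableName);
    admin.close();
  }
}

A scan-side sketch for the read-vs-write contention discussed in the quoted thread follows at the end of this mail.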
On Sun, Sep 7, 2014 at 8:42 AM, lars hofhansl wrote:

> Thinking about it again, if you ran into HBASE-7336 you'd see high CPU
> load, but *not* IOWAIT.
> 0.94 is at 0.94.23, you should upgrade. A lot of fixes, improvements, and
> performance enhancements went in since 0.94.4.
> You can do a rolling upgrade straight to 0.94.23.
>
> With that out of the way, can you post a jstack of the processes that
> experience high wait times?
>
> -- Lars
>
> ------------------------------
> *From:* kiran
> *To:* user@hbase.apache.org; lars hofhansl
> *Sent:* Saturday, September 6, 2014 11:30 AM
> *Subject:* Re: HBase - Performance issue
>
> Lars,
>
> We are facing a similar situation on a similar cluster configuration.
> We are seeing high I/O wait percentages on some machines in our cluster.
> We have short-circuit reads enabled, but we still face the same problem:
> the CPU wait goes up to 50% in some cases while issuing scan commands
> with multiple threads. Is there a workaround other than applying the
> patch for 0.94.4?
>
> Thanks
> Kiran
>
>
> On Thu, Apr 25, 2013 at 12:12 AM, lars hofhansl wrote:
>
> You may have run into https://issues.apache.org/jira/browse/HBASE-7336
> (which is in 0.94.4).
> (Although I had not observed this effect as much when short-circuit reads
> are enabled.)
>
>
> ----- Original Message -----
> From: kzurek
> To: user@hbase.apache.org
> Cc:
> Sent: Wednesday, April 24, 2013 3:12 AM
> Subject: HBase - Performance issue
>
> The problem is that when I'm putting my data (multithreaded client,
> ~30 MB/s outgoing traffic) into the cluster, the load is spread equally
> over all RegionServers, with 3.5% average CPU wait time (average CPU
> user: 51%). When I add a similar multithreaded client that scans for,
> say, the 100 last samples of a randomly generated key from a chosen
> time range, I get high CPU wait time (20% and up) on two (or more, if
> there is a higher number of threads; default 10) random RegionServers.
> The machines holding those RegionServers get very hot; one consequence
> is that the number of store files constantly increases, up to the
> maximum limit. The rest of the RegionServers have 10-12% CPU wait time
> and everything seems to be OK (the number of store files varies, so they
> are being compacted and do not increase over time). Any ideas? Maybe I
> could prioritize writes over reads somehow? Is that possible? If so,
> what would be the best way to do it, and where should it be placed
> (on the client or the cluster side)?
>
> Cluster specification:
> HBase Version 0.94.2-cdh4.2.0
> Hadoop Version 2.0.0-cdh4.2.0
> There are 6xDataNodes (5xHDD for storing data), 1xMasterNode
> Other settings:
> - Bloom filters (ROWCOL) set
> - Short-circuit reads turned on
> - HDFS Block Size: 128 MB
> - Java Heap Size of NameNode/Secondary NameNode in Bytes: 8 GiB
> - Java Heap Size of HBase RegionServer in Bytes: 12 GiB
> - Java Heap Size of HBase Master in Bytes: 4 GiB
> - Java Heap Size of DataNode in Bytes: 1 GiB (default)
> Number of regions per RegionServer: 19 (total 114 regions on 6 RS)
> Key design: <UUID><TIMESTAMP> -> UUID: 1-10M, TIMESTAMP: 1-N
> Table design: 1 column family with 20 columns of 8 bytes
>
> Get client:
> Multiple threads.
> Each thread has its own table instance with its own Scanner.
> Each thread has its own range of UUIDs and randomly draws the beginning
> of the time range to build the rowkey properly (see above).
> Each Scan requests the same number of rows, but with a random rowkey.
>
> --
> View this message in context:
> http://apache-hbase.679495.n3.nabble.com/HBase-Performance-issue-tp4042836.html
> Sent from the HBase User mailing list archive at Nabble.com.
>
>
> --
> Thank you
> Kiran Sarvabhotla
>
> -----Even a correct decision is wrong when it is taken late


--
Thank you
Kiran Sarvabhotla

-----Even a correct decision is wrong when it is taken late
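On kzurek's question in the quoted post about prioritizing writes over reads: as far as I know, 0.94 has no server-side knob for that, so the usual workaround is to make the scan client as cheap as possible, e.g. bound the rows fetched per RPC and keep scanned blocks out of the block cache. Below is a minimal sketch against the 0.94 client API. The table name "samples", the 8-byte numeric UUID, and the big-endian <UUID><TIMESTAMP> rowkey encoding are my assumptions, not details confirmed in the thread:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class BoundedScanClient {

  // Fetches at most 'limit' rows for one UUID starting at 'startTs'.
  // Assumed rowkey layout: 8-byte big-endian UUID followed by an 8-byte
  // big-endian timestamp (the <UUID><TIMESTAMP> design from the quoted post).
  static void scanSamples(HTable table, long uuid, long startTs, int limit)
      throws IOException {
    byte[] startRow = Bytes.add(Bytes.toBytes(uuid), Bytes.toBytes(startTs));
    byte[] stopRow  = Bytes.add(Bytes.toBytes(uuid), Bytes.toBytes(Long.MAX_VALUE));

    Scan scan = new Scan(startRow, stopRow);
    scan.setCaching(limit);       // one next() RPC fetches at most 'limit' rows
    scan.setCacheBlocks(false);   // keep scanned blocks out of the block cache

    ResultScanner scanner = table.getScanner(scan);
    try {
      int fetched = 0;
      for (Result r : scanner) {
        // process r ...
        if (++fetched >= limit) {
          break;                  // stop early instead of draining the range
        }
      }
    } finally {
      scanner.close();
    }
  }

  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "samples");   // made-up table name
    try {
      // e.g. the last hour of samples for UUID 42, capped at 100 rows
      scanSamples(table, 42L, System.currentTimeMillis() - 3600 * 1000L, 100);
    } finally {
      table.close();
    }
  }
}

Keeping setCaching at the row limit bounds the work each scanner RPC puts on a RegionServer, and setCacheBlocks(false) stops large scans from evicting the hot blocks that other reads rely on; a per-thread rate limit on the scanning client is a further option.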