Date: Sat, 10 Jan 2015 16:19:33 -0800
Subject: Re: Low CPU usage and slow reads in pseudo-distributed mode - how to fix?
From: Ted Yu
To: "user@hbase.apache.org"

Please see http://hbase.apache.org/book.html#perf.reading

I guess you use 0.90.4 because of Nutch integration. Still, 0.90.x is far too old.

bq. HBase has a heapsize of 1.5 Gigs

This is not enough memory for good read performance. Please consider giving HBase more heap.

Cheers

On Sat, Jan 10, 2015 at 4:04 PM, Dave Benson wrote:

> Hi HBase users,
>
> I'm working with HBase for the first time and I'm trying to sort out a
> performance issue. HBase is the data store for a small, focused web crawl
> I'm performing with Apache Nutch. I'm running in pseudo-distributed mode,
> meaning that Nutch, HBase and Hadoop are all on the same machine. The
> machine's a few years old and has only 4 gigs of RAM - much smaller than
> most HBase installs, I know.
>
> When I first start my HBase processes I get about 60 seconds of fast
> performance. HBase reads quickly and uses a healthy portion of CPU cycles.
> After a minute or so, though, HBase slows dramatically. Reads sink to a
> glacial pace, and the CPU sits mostly idle.
>
> I notice this pattern when I run Nutch - particularly during read-heavy
> operations - but also when I run a simple row counter from the shell.
>
> At the moment " count 'my_table' " takes almost 4 hours to read through
> 500 000 rows. The reading is much faster at the start than the end.
> In the first 30 seconds, HBase counts 37000 rows, but in the 30 seconds
> between 8:00 and 8:30, only 1000 are counted.
>
> Looking through my Ganglia report I see a brief return to high performance
> around 3 hours into the count. I don't know what's causing this spike.
>
> Can anyone suggest what configuration parameters I should change to improve
> read performance? Or what reference materials I should consult to better
> understand the problem? Again, I'm totally new to HBase.
>
> I'm using HBase 0.90.4 and Hadoop 1.2.2. HBase has a heapsize of 1.5 Gigs.
>
> Here's a Ganglia report covering the 4 hours of " count 'my_table' ":
> http://imgur.com/Aa3eukZ
>
> Please let me know if I can provide any more information.
>
> Many thanks,
>
> Dave
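
For scale, the row counts quoted in the original message can be turned into read rates. The sketch below is only arithmetic on those figures: the variable and function names are illustrative, and "almost 4 hours" is treated as a full 4 hours, so the overall rate is a slight underestimate. It suggests roughly 1,230 rows per second at the start against roughly 33 rows per second later on, with the whole-run average close to the slow rate.

```python
# Figures come from the quoted message; all names here are illustrative.
TOTAL_ROWS = 500_000
TOTAL_SECONDS = 4 * 60 * 60  # "almost 4 hours", rounded up to 4 hours (assumption)


def rows_per_second(rows: int, seconds: float) -> float:
    """Average read rate over a time window."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return rows / seconds


early_rate = rows_per_second(37_000, 30)   # first 30 seconds of the count
late_rate = rows_per_second(1_000, 30)     # the 8:00 to 8:30 window
overall_rate = rows_per_second(TOTAL_ROWS, TOTAL_SECONDS)

# How much slower the later window is than the first one.
slowdown = early_rate / late_rate

# Time the whole table would take if the early rate had been sustained.
minutes_at_early_rate = TOTAL_ROWS / early_rate / 60

print(f"early:   {early_rate:.0f} rows/s")
print(f"late:    {late_rate:.0f} rows/s")
print(f"overall: {overall_rate:.0f} rows/s")
print(f"slowdown factor: {slowdown:.0f}x")
print(f"full count at early rate: {minutes_at_early_rate:.1f} minutes")
```

Since the overall average sits so close to the slow rate, the count evidently spends nearly all of its time in the degraded state rather than slowing down gradually.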