From: stack
To: hbase-user@hadoop.apache.org
Subject: Re: HBase-0.20.0 Performance Evaluation
Date: Tue, 18 Aug 2009 10:36:00 -0700
Message-ID: <7c962aed0908181036m4339160t23b3c5e4f97b1680@mail.gmail.com>
In-Reply-To: <4A8AE4C4.8020209@streamy.com>

What do you have for GC config, Schubert? Now it's 8ms a random read?
St.Ack

On Tue, Aug 18, 2009 at 10:28 AM, Jonathan Gray wrote:

> Schubert,
>
> I can't think of any reason your random reads would get slower after
> inserting more data, besides GC issues.
>
> Do you have GC logging and JVM metrics logging turned on? I would inspect
> those to see if you have any long-running GC pauses, or just lots and lots
> of GC going on.
>
> If I recall, you are running on 4GB nodes, 2GB RS heap, and cohosted
> DataNodes and TaskTrackers. We ran for a long time on a similar setup, but
> once we moved to 0.20 (and to the CMS garbage collector), we really needed
> to add more memory to the nodes and increase RS heap to 4 or 5GB. The CMS
> GC is less memory-efficient but, given sufficient resources, is much
> better for overall performance/throughput.
>
> Also, do you have Ganglia set up? Are you seeing swapping on your RS nodes?
> Is there high IO-wait CPU usage?
>
> JG
>
> Schubert Zhang wrote:
>
> >> Addition.
> >> Only random-reads become very slow; scans and sequential-reads are OK.
> >>
> >> On Tue, Aug 18, 2009 at 6:02 PM, Schubert Zhang wrote:
> >>
> >>> stack and J-G, thank you very much for your helpful comments.
> >>>
> >>> But now we have found a critical issue with random reads.
> >>> I used sequential-writes to insert 5GB of data into our HBase table,
> >>> starting from empty, and ~30 regions were generated. The random-reads
> >>> run then took about 30 minutes to complete. Then I ran
> >>> sequential-writes again, so another version of each cell was inserted
> >>> and ~60 regions were generated. But when I ran random-reads against
> >>> this table again, it always took a long time (more than 2 hours).
> >>>
> >>> I checked the heap usage and other metrics but did not find the reason.
> >>>
> >>> Below is the status of one region server:
> >>> request=0.0, regions=13, stores=13, storefiles=14, storefileIndexSize=2,
> >>> memstoreSize=0, usedHeap=1126, maxHeap=1991, blockCacheSize=338001080,
> >>> blockCacheFree=79686056, blockCacheCount=5014, blockCacheHitRatio=55
> >>>
> >>> Schubert
> >>>
> >>> On Tue, Aug 18, 2009 at 5:02 AM, Schubert Zhang wrote:
> >>>
> >>>> We have just done a Performance Evaluation on HBase-0.20.0.
> >>>> Refer to:
> >>>>
> >>>> http://docloud.blogspot.com/2009/08/hbase-0200-performance-evaluation.html
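As a rough aid for reading the region server status line quoted above, here is a minimal Python sketch that parses the line and derives a few ratios. The helper names (parse_status, summarize) are hypothetical and not part of HBase. It assumes usedHeap and maxHeap are in MB, that blockCacheSize plus blockCacheFree approximates the cache capacity in bytes, and that blockCacheHitRatio is a percentage; none of these units are stated in the thread.

```python
import re

# Status line copied verbatim from the region server report in this thread.
STATUS = (
    "request=0.0, regions=13, stores=13, storefiles=14, storefileIndexSize=2, "
    "memstoreSize=0, usedHeap=1126, maxHeap=1991, blockCacheSize=338001080, "
    "blockCacheFree=79686056, blockCacheCount=5014, blockCacheHitRatio=55"
)


def parse_status(line):
    """Turn 'key=value, key=value' text into a dict of floats."""
    return {key: float(value) for key, value in re.findall(r"(\w+)=([\d.]+)", line)}


def summarize(metrics):
    """Derive simple ratios from the parsed metrics.

    Assumptions (not confirmed by the thread): heap figures are in MB,
    block cache figures are in bytes, and the hit ratio is a percentage.
    """
    cache_capacity = metrics["blockCacheSize"] + metrics["blockCacheFree"]
    return {
        "heap_used_fraction": metrics["usedHeap"] / metrics["maxHeap"],
        "cache_capacity_bytes": cache_capacity,
        "cache_used_fraction": metrics["blockCacheSize"] / cache_capacity,
        "cache_miss_percent": 100 - metrics["blockCacheHitRatio"],
    }


if __name__ == "__main__":
    for name, value in summarize(parse_status(STATUS)).items():
        print(f"{name}: {value:.3f}")
```

Under those assumptions, the heap is a little over half used while the block cache is mostly full and close to half of block lookups miss it. That only describes the snapshot; it does not by itself confirm or rule out the GC explanation discussed in the thread.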