Message-ID: <30536916.86451286842892408.JavaMail.jira@thor>
Date: Mon, 11 Oct 2010 20:21:32 -0400 (EDT)
From: "Kannan Muthukkaruppan (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] Updated: (HBASE-3103) investigate/improve compaction performance
In-Reply-To: <14369954.81421286823573954.JavaMail.jira@thor>

     [ https://issues.apache.org/jira/browse/HBASE-3103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kannan Muthukkaruppan updated HBASE-3103:
-----------------------------------------
    Attachment: profiler_data.jpg

Find attached
profiler screenshot. Some highlights:

hfile.Compression$FinishOnFlushCompressionStream.write - 31%
StoreScanner.next - 28% (HFile$Reader.decompression is 11% & ScanQueryMatcher.match - 8%)
ByteBloomFilter.add - 20%
hfile.HFile$Writer.finishBlock - 4%

> investigate/improve compaction performance
> ------------------------------------------
>
>                 Key: HBASE-3103
>                 URL: https://issues.apache.org/jira/browse/HBASE-3103
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Kannan Muthukkaruppan
>         Attachments: profiler_data.jpg
>
>
> I was running some tests and am seeing that major compacting about 100M of data seems to take around 40-50 seconds.
> My simplified test case is something like:
> * Created about a 100M store file (800M uncompressed).
> * 10k keys with 1k columns each (avg. key size: 30 bytes; avg. value size: 45 bytes)
> * Compression and ROWCOL bloom were turned on.
> The test was to major compact this single store file into a new file.
> Added some nanoTime() calls around these three stages:
> * Scanner.next operations
> * bloom computation logic in: StoreFile:append()
> * StoreFile.Writer.append()
> This is what I saw for these three stages:
> {code}
> 2010-10-11 11:25:39,774 INFO org.apache.hadoop.hbase.regionserver.Store: major Compaction scanTime (ns) 4338103000
> 2010-10-11 11:25:39,774 INFO org.apache.hadoop.hbase.regionserver.Store: major Compaction bloom only time (ns) 14433821000
> 2010-10-11 11:25:39,774 INFO org.apache.hadoop.hbase.regionserver.Store: major Compaction append time (ns) 23191478000
> {code}
> The HFile.getReadTime() and HFile.getWriteTime() themselves seem pretty low (under 1 second levels). These are the times for the parts that interact with the DFS (readBlock() and finishBlock() mostly).
> Are these numbers roughly in line with what others are seeing normally?
> Will double check my instrumentation, and try to get more data. Might try to run it under a profiler.
But wanted to put it out there for additional input/ideas on improvement.
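
For anyone who wants to reproduce the per-stage timing described in the issue, here is a minimal sketch of how nanoTime() deltas could be accumulated per stage and then summarized. This is not the instrumentation from the actual test: the class name CompactionStageTimer and the stage labels "scan", "bloom" and "append" are hypothetical, and nothing here is an HBase API. The example feeds in the three nanosecond totals from the log lines in the issue. The issue does not say whether the three timed stages overlap; if they are disjoint, they add up to roughly 42 seconds, which falls within the reported 40-50 second range.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper (not part of HBase): accumulates elapsed nanoseconds per named stage.
class CompactionStageTimer {
    // Insertion-ordered so the report lists stages in the order they were first seen.
    private final Map<String, Long> totals = new LinkedHashMap<>();

    // Add an already-measured duration to a stage's running total.
    void add(String stage, long elapsedNanos) {
        totals.merge(stage, elapsedNanos, Long::sum);
    }

    // Run one unit of work and charge its wall-clock time to the given stage.
    <T> T time(String stage, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            add(stage, System.nanoTime() - start);
        }
    }

    long totalNanos(String stage) {
        return totals.getOrDefault(stage, 0L);
    }

    long sumNanos() {
        long sum = 0;
        for (long v : totals.values()) {
            sum += v;
        }
        return sum;
    }

    // One line per stage: seconds and share of the summed time.
    String report() {
        StringBuilder sb = new StringBuilder();
        long sum = sumNanos();
        for (Map.Entry<String, Long> e : totals.entrySet()) {
            sb.append(String.format("%-6s %6.2f s  %5.1f%%%n",
                    e.getKey(), e.getValue() / 1e9, 100.0 * e.getValue() / sum));
        }
        sb.append(String.format("total  %6.2f s%n", sum / 1e9));
        return sb.toString();
    }

    public static void main(String[] args) {
        CompactionStageTimer t = new CompactionStageTimer();
        // Nanosecond totals copied from the log lines quoted in the issue.
        t.add("scan", 4338103000L);
        t.add("bloom", 14433821000L);
        t.add("append", 23191478000L);
        System.out.print(t.report());
    }
}
```

In a real compaction loop, each Scanner.next or append call would be wrapped with time(...), or with a pair of nanoTime() calls feeding add(...), so the totals build up across all key/values. Keeping the bloom and append timers as separate, non-nested measurements makes it possible to say how much of the write path is bloom work versus compression and block writing.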