X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id 73574DB1F
    for ; Fri, 21 Sep 2012 05:29:13 +0000 (UTC)
Received: (qmail 75833 invoked by uid 500); 21 Sep 2012 05:29:12 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 75568 invoked by uid 500); 21 Sep 2012 05:29:09 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 75506 invoked by uid 99); 21 Sep 2012 05:29:08 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Fri, 21 Sep 2012 05:29:08 +0000
Date: Fri, 21 Sep 2012 16:29:08 +1100 (NCT)
From: "Cheng Hao (JIRA)"
To: issues@hbase.apache.org
Message-ID: <2113316877.106277.1348205348073.JavaMail.jiratomcat@arcas>
In-Reply-To: <1241082207.105832.1348196949209.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460224#comment-13460224 ]

Cheng Hao commented on HBASE-6852:
----------------------------------

@stack: It would make more sense to put close() into AbstractHFileReader, but I am not sure whether there are other concerns, since AbstractHFileReader doesn't have it today.
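
For readers following the thread, the batching idea under discussion can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the attached patch: the class name BatchedHitCounter, the LongConsumer sink, and the method names are hypothetical, and only THRESHOLD_METRICS_FLUSH (with the 2k value) and the need for a close() come from this thread. The per-reader AtomicLong is still touched on every hit, while the expensive shared update runs only once per threshold, plus a final flush on close().

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongConsumer;

/** Hypothetical sketch: batches cache-hit counts before a costly shared update. */
class BatchedHitCounter implements AutoCloseable {
    // Flush threshold; 2k is the value mentioned in this thread.
    static final long THRESHOLD_METRICS_FLUSH = 2000;

    // Stand-in (assumption) for the expensive shared metrics update.
    private final LongConsumer sharedMetricsSink;
    // Cheap per-reader counter, still updated on every cache hit.
    private final AtomicLong pending = new AtomicLong();

    BatchedHitCounter(LongConsumer sharedMetricsSink) {
        this.sharedMetricsSink = sharedMetricsSink;
    }

    /** Called once per cache hit; flushes only when the threshold is reached. */
    void onCacheHit() {
        if (pending.incrementAndGet() >= THRESHOLD_METRICS_FLUSH) {
            flush();
        }
    }

    /** Moves whatever has accumulated into the shared metrics. */
    void flush() {
        long n = pending.getAndSet(0);
        if (n > 0) {
            sharedMetricsSink.accept(n);
        }
    }

    /** Without this, up to THRESHOLD_METRICS_FLUSH - 1 hits would never be reported. */
    @Override
    public void close() {
        flush();
    }
}
```

The trade-off is the one raised in the comment: a larger threshold means fewer shared updates, but each open reader can hold back up to THRESHOLD_METRICS_FLUSH - 1 hits from a metrics snapshot until it flushes or is closed.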

As for THRESHOLD_METRICS_FLUSH = 2k, which I used during my testing: I hope it is big enough to reduce the overhead while having little impact on getting a timely metrics snapshot. Sorry, I may not be able to give a good empirical number for it.

@Lars: Yes, that's right, we're still updating an AtomicLong each time. But in the profiling results I didn't see the AtomicLong become a new hotspot, and the testing also showed more than 10% saved in running time, which may mean the overhead of the AtomicLong can be ignored.

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6852
>                 URL: https://issues.apache.org/jira/browse/HBASE-6852
>             Project: HBase
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions: 0.94.0
>            Reporter: Cheng Hao
>            Priority: Minor
>              Labels: performance
>             Fix For: 0.94.2, 0.96.0
>
>         Attachments: onhitcache-trunk.patch
>
>
> The SchemaMetrics.updateOnCacheHit costs too much while I am doing a full table scan.
> Here are the top 5 hotspots within the regionserver while full scanning a table (sorry for the poor formatting):
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 5000000
> samples  %        image name  symbol name
> -------------------------------------------------------------------------------
> 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
>   98447  100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> -------------------------------------------------------------------------------
> 45814    6.2510   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
>   45814  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 43523    5.9384   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523  100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> -------------------------------------------------------------------------------
> 42548    5.8054   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
>   42548  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 40572    5.5358   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572  100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira