Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Fri, 21 Sep 2012 16:29:08 +1100 (NCT)
From: "Cheng Hao (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <2113316877.106277.1348205348073.JavaMail.jiratomcat@arcas>
In-Reply-To: <1241082207.105832.1348196949209.JavaMail.jiratomcat@arcas>
Subject: [jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit
 costs too much while full scanning a table with all of its fields
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460224#comment-13460224 ] 

Cheng Hao commented on HBASE-6852:
----------------------------------

@stack: It would make more sense to put close() into AbstractHFileReader, but I'm not sure whether there are other concerns, since AbstractHFileReader doesn't currently have it.

As for THRESHOLD_METRICS_FLUSH = 2k, that is the value I used during my testing. I hope it is big enough to reduce the overhead, yet small enough that metrics snapshots still stay timely. Sorry, I may not be able to give a good empirical number for it.

@Lars: Yes, that's right, we're still updating an AtomicLong each time. But in the profiling results I didn't see AtomicLong become a new hotspot, and the testing also showed >10% saved in running time, which may mean the overhead of AtomicLong can be ignored.
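
For illustration only, here is a minimal sketch of the batching idea discussed above. It is not the attached patch. It assumes the expensive part is the update of the shared metrics structure rather than the AtomicLong itself. The class BatchedHitCounter, its methods, the metric key, and the ConcurrentHashMap standing in for the shared metrics are all hypothetical, and 2000 is only an assumed reading of "2k".

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: count cache hits in a cheap local AtomicLong and push
// them to the shared metrics map only once per THRESHOLD_METRICS_FLUSH hits.
class BatchedHitCounter {
    // Assumed value; the comment above only says "2k".
    static final long THRESHOLD_METRICS_FLUSH = 2000;

    // Stand-in for the shared (expensive to update) metrics store.
    private final ConcurrentHashMap<String, AtomicLong> sharedMetrics;
    private final String metricName;
    // Hits accumulated since the last flush.
    private final AtomicLong pending = new AtomicLong();

    BatchedHitCounter(ConcurrentHashMap<String, AtomicLong> sharedMetrics,
                      String metricName) {
        this.sharedMetrics = sharedMetrics;
        this.metricName = metricName;
    }

    // Called on every cache hit: one AtomicLong increment, no map lookup
    // unless the threshold has been reached.
    void onCacheHit() {
        if (pending.incrementAndGet() >= THRESHOLD_METRICS_FLUSH) {
            flush();
        }
    }

    // Moves whatever has accumulated into the shared metrics store.
    void flush() {
        long delta = pending.getAndSet(0);
        if (delta > 0) {
            sharedMetrics.computeIfAbsent(metricName, k -> new AtomicLong())
                         .addAndGet(delta);
        }
    }

    // Flushes the remainder so that no hits are lost when the reader closes.
    void close() {
        flush();
    }
}
```

In this sketch a larger threshold means fewer updates of the shared store but a staler metrics snapshot, which is the trade-off behind picking the threshold value. The final flush in close() is why the reader needs a close() hook at all.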
                
> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6852
>                 URL: https://issues.apache.org/jira/browse/HBASE-6852
>             Project: HBase
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions: 0.94.0
>            Reporter: Cheng Hao
>            Priority: Minor
>              Labels: performance
>             Fix For: 0.94.2, 0.96.0
>
>         Attachments: onhitcache-trunk.patch
>
>
> SchemaMetrics.updateOnCacheHit costs too much during a full table scan.
> Here are the top 5 hotspots within the regionserver while full scanning a table (sorry for the poor formatting):
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 5000000
> samples  %        image name               symbol name
> -------------------------------------------------------------------------------
> 98447    13.4324  14033.jo                 void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
>   98447    100.000  14033.jo                 void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> -------------------------------------------------------------------------------
> 45814     6.2510  14033.jo                 int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
>   45814    100.000  14033.jo                 int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 43523     5.9384  14033.jo                 boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523    100.000  14033.jo                 boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> -------------------------------------------------------------------------------
> 42548     5.8054  14033.jo                 int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
>   42548    100.000  14033.jo                 int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 40572     5.5358  14033.jo                 int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572    100.000  14033.jo                 int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]
