hbase-issues mailing list archives

From "Cheng Hao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
Date Mon, 05 Nov 2012 14:16:24 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13490649#comment-13490649 ]

Cheng Hao commented on HBASE-6852:
----------------------------------

@Lars, thank you for committing the patch.
The current snapshot of the 0.94 branch improves scan performance by about 17.7% in my case,
and HBASE-6032 certainly helps a lot.
Here are the new hotspots for the RegionServer, as reported by OProfile:

{code:title=Hotspots|borderStyle=solid}
CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 5000000
samples  %        image name               symbol name
183371   17.1144  4465.jo                  int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
63267     5.9049  4465.jo                  org.apache.hadoop.hbase.regionserver.ScanQueryMatcher$MatchCode org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(org.apache.hadoop.hbase.KeyValue)
59762     5.5777  4465.jo                  byte[] org.apache.hadoop.hbase.KeyValue.createByteArray(byte[], int, int, byte[], int, int, byte[], int, int, long, org.apache.hadoop.hbase.KeyValue$Type, byte[], int, int)
50975     4.7576  4465.jo                  int org.apache.hadoop.hbase.io.hfile.HFileReaderV2$ScannerV2.blockSeek(byte[], int, int, boolean)
50891     4.7498  4465.jo                  void org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek()
38257     3.5706  4465.jo                  jbyte_disjoint_arraycopy
37973     3.5441  4465.jo                  boolean org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(boolean, org.apache.hadoop.hbase.KeyValue, boolean, boolean)~1
33978     3.1712  4465.jo                  void org.apache.hadoop.util.PureJavaCrc32C.update(byte[], int, int)
{code}
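
For anyone who wants to compare listings like the one above between runs, here is a minimal sketch of a parser for the flat "samples / % / image name / symbol name" rows. It is only an illustration: the names ROW, parse_rows, short_name and SAMPLE are made up for this example and are not part of OProfile or HBase, and it assumes each row sits on a single line (no mail line-wrapping). The sample rows are copied from the listing above.

```python
import re

# One flat report row: samples, percent, image name, symbol name.
# Header lines ("CPU: ...", "samples  % ...") do not match and are skipped.
ROW = re.compile(r"^\s*(\d+)\s+(\d+\.\d+)\s+(\S+)\s+(.+?)\s*$")


def parse_rows(report):
    """Return a list of (samples, percent, image, symbol) tuples."""
    rows = []
    for line in report.splitlines():
        m = ROW.match(line)
        if m:
            samples, percent, image, symbol = m.groups()
            rows.append((int(samples), float(percent), image, symbol))
    return rows


def short_name(symbol):
    """Reduce a full Java signature to Class.method; leave native symbols as-is."""
    head = symbol.split("(", 1)[0]   # drop the argument list
    qualified = head.split()[-1]     # drop the return type, if any
    parts = qualified.split(".")
    return ".".join(parts[-2:]) if len(parts) >= 2 else qualified


# Three rows taken from the listing above (hypothetical variable name).
SAMPLE = """\
samples  %        image name               symbol name
183371   17.1144  4465.jo                  int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
50891     4.7498  4465.jo                  void org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek()
38257     3.5706  4465.jo                  jbyte_disjoint_arraycopy
"""

if __name__ == "__main__":
    rows = parse_rows(SAMPLE)
    for samples, percent, _image, symbol in rows:
        print(f"{percent:8.4f}%  {samples:>8}  {short_name(symbol)}")
    print(f"listed rows cover {sum(r[1] for r in rows):.4f}% of samples")
```

Shortening the symbols to Class.method makes two listings easier to compare side by side, at the cost of hiding overloads that differ only in their argument lists.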
                
> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6852
>                 URL: https://issues.apache.org/jira/browse/HBASE-6852
>             Project: HBase
>          Issue Type: Improvement
>          Components: metrics
>    Affects Versions: 0.94.0
>            Reporter: Cheng Hao
>            Assignee: Cheng Hao
>            Priority: Minor
>              Labels: performance
>             Fix For: 0.94.3
>
>         Attachments: 6852-0.94_2.patch, 6852-0.94_3.patch, 6852-0.94.txt, metrics_hotspots.png, onhitcache-trunk.patch
>
>
> SchemaMetrics.updateOnCacheHit costs too much while I am doing a full table scan.
> Here are the top 5 hotspots within the regionserver while full scanning a table (sorry for the poor formatting):
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 5000000
> samples  %        image name               symbol name
> -------------------------------------------------------------------------------
> 98447    13.4324  14033.jo                 void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
>   98447    100.000  14033.jo                 void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> -------------------------------------------------------------------------------
> 45814     6.2510  14033.jo                 int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
>   45814    100.000  14033.jo                 int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 43523     5.9384  14033.jo                 boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523    100.000  14033.jo                 boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> -------------------------------------------------------------------------------
> 42548     5.8054  14033.jo                 int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
>   42548    100.000  14033.jo                 int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 40572     5.5358  14033.jo                 int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572    100.000  14033.jo                 int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
