hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-13291) Lift the scan ceiling
Date Tue, 31 Mar 2015 00:04:02 GMT

     [ https://issues.apache.org/jira/browse/HBASE-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-13291:
    Attachment: 13291.hacks.txt

Adds methods on KeyValue (KV) so ScanQueryMatcher (SQM) can use them in SQM#match when the cell
is a KV, avoiding a reparse of KV offsets and lengths.
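
A rough sketch of the idea, not the code in the attachment: once the row length and family
length have been parsed, later offsets can be derived by passing those values along rather
than re-reading them from the backing array. The class and method names here are hypothetical.
The layout assumed is the standard KeyValue one (4-byte key length, 4-byte value length,
2-byte row length, row, 1-byte family length, family, qualifier, 8-byte timestamp, 1-byte type).

```java
// Hypothetical helper, for illustration only; not part of HBase.
final class KeyValueOffsetsSketch {
    private KeyValueOffsetsSketch() {}

    static final int KEY_OFFSET = 8;          // 4-byte key length + 4-byte value length
    static final int ROW_LENGTH_SIZE = 2;     // row length is a 2-byte short
    static final int FAMILY_LENGTH_SIZE = 1;  // family length is a single byte
    static final int TIMESTAMP_TYPE_SIZE = 9; // 8-byte timestamp + 1-byte type

    /** Reads the 4-byte big-endian key length at the start of the KV. */
    static int keyLength(byte[] b, int kvOffset) {
        return ((b[kvOffset] & 0xFF) << 24) | ((b[kvOffset + 1] & 0xFF) << 16)
            | ((b[kvOffset + 2] & 0xFF) << 8) | (b[kvOffset + 3] & 0xFF);
    }

    /** Reads the 2-byte big-endian row length that opens the key. */
    static int rowLength(byte[] b, int kvOffset) {
        return ((b[kvOffset + KEY_OFFSET] & 0xFF) << 8) | (b[kvOffset + KEY_OFFSET + 1] & 0xFF);
    }

    /** Family offset from an already-parsed row length; no array access needed. */
    static int familyOffset(int kvOffset, int rowLength) {
        return kvOffset + KEY_OFFSET + ROW_LENGTH_SIZE + rowLength + FAMILY_LENGTH_SIZE;
    }

    /** Family length is the byte just before the family bytes. */
    static int familyLength(byte[] b, int familyOffset) {
        return b[familyOffset - 1];
    }

    /** Qualifier offset from already-parsed family offset and length. */
    static int qualifierOffset(int familyOffset, int familyLength) {
        return familyOffset + familyLength;
    }

    /** Qualifier length is whatever remains of the key after the fixed parts. */
    static int qualifierLength(int keyLength, int rowLength, int familyLength) {
        return keyLength
            - (ROW_LENGTH_SIZE + rowLength + FAMILY_LENGTH_SIZE + familyLength + TIMESTAMP_TYPE_SIZE);
    }
}
```

Each method is a few bytecodes long, which keeps it a candidate for JIT inlining.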

Keeps the methods small so JIT compilation and inlining can happen.

The readKeyValueLen methods in HFileReaderV2 and V3 now maintain their own read position; faster.
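
One possible reading of self-position maintenance, sketched under assumptions: rather than
moving the ByteBuffer's position with relative gets and then restoring it with mark/reset,
the reader keeps a local index and uses absolute gets. The class below is hypothetical and
reads only the two length ints; the real readKeyValueLen also covers tags and mvcc.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration; not the HFileReaderV2 code.
final class KeyValueLenSketch {
    private KeyValueLenSketch() {}

    /** Relative gets: the buffer tracks position, and mark/reset restores it. */
    static int[] readWithMarkReset(ByteBuffer block) {
        block.mark();
        int keyLen = block.getInt();
        int valueLen = block.getInt();
        block.reset();
        return new int[] {keyLen, valueLen};
    }

    /** Absolute gets: a local variable tracks position, the buffer is never moved. */
    static int[] readWithOwnPosition(ByteBuffer block) {
        int p = block.position();
        int keyLen = block.getInt(p);
        p += Integer.BYTES;
        int valueLen = block.getInt(p);
        return new int[] {keyLen, valueLen};
    }
}
```

Both variants return the same lengths and leave the buffer position where it started.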

Inlined mvcc vint parse to try and speed it up.
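
For reference, a minimal sketch of what an inlined vint/vlong parse can look like, assuming
the mvcc value uses the Hadoop WritableUtils variable-length encoding and sits in a plain
byte array. The class name is hypothetical and this is not the patch itself.

```java
// Hypothetical illustration of decoding a Hadoop-style vlong straight from a byte[].
final class VLongSketch {
    private VLongSketch() {}

    /** Total encoded size in bytes, derived from the first byte alone. */
    static int encodedSize(byte first) {
        if (first >= -112) {
            return 1;                 // value is stored in the first byte itself
        }
        if (first < -120) {
            return -119 - first;      // negative multi-byte value
        }
        return -111 - first;          // positive multi-byte value
    }

    /** Decodes the vlong starting at buf[offset] without any stream wrapper. */
    static long readVLong(byte[] buf, int offset) {
        byte first = buf[offset];
        if (first >= -112) {
            return first;             // fast path: single-byte value
        }
        int size = encodedSize(first);
        long value = 0;
        for (int i = 1; i < size; i++) {
            value = (value << 8) | (buf[offset + i] & 0xFF); // big-endian payload
        }
        // Negative values are stored as their one's complement.
        return first < -120 ? ~value : value;
    }
}
```

The single-byte fast path is the case worth keeping tight, since small values take it.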

Here is what it looks like currently:

Samples: 40M of event 'cycles', Event count (approx.): 427157275282
 26.23%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/io/hfile/HFileReaderV2$ScannerV2;._next
in Lorg/apache/hadoop/hbase/regionserver/StoreFileScanner;.next
 12.62%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/util/Bytes$LexicographicalComparerHolder$UnsafeComparer;.compareTo
in Lorg/apache/hadoop/hbase/regionserver/ScanQueryMatcher;.moreMatch
 11.65%  perf-13101.map      [.] 0x00007f1fed96e4ec
  6.22%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/regionserver/StoreScanner;.optimize
in Lorg/apache/hadoop/hbase/regionserver/StoreScanner;.next
  3.87%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/util/Bytes$LexicographicalComparerHolder$UnsafeComparer;.compareTo
in Lorg/apache/hadoop/hbase/regionserver/ScanQueryMatcher;.match
  3.63%  perf-13101.map      [.] Ljava/util/PriorityQueue;.poll in Lorg/apache/hadoop/hbase/regionserver/KeyValueHeap;.pollRealKV
  2.92%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/util/Bytes$LexicographicalComparerHolder$UnsafeComparer;.compareTo
in Lorg/apache/hadoop/hbase/regionserver/ScanQueryMatcher;.match
  2.83%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/regionserver/KeyValueHeap;.next
in Lorg/apache/hadoop/hbase/regionserver/HRegion$RegionScannerImpl;.populateResult
  1.80%  libjvm.so           [.] BlockOffsetArrayNonContigSpace::block_start_unsafe(void const*)
  1.63%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/regionserver/StoreScanner;.next
  1.56%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/regionserver/ScanQueryMatcher;.match
  1.53%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/util/Bytes$LexicographicalComparerHolder$UnsafeComparer;.compareTo
in Lorg/apache/hadoop/hbase/regionserver/HRegion$RegionScannerImpl;.isStopRow
  1.53%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/regionserver/KeyValueHeap;.peek
in Lorg/apache/hadoop/hbase/regionserver/HRegion$RegionScannerImpl;.nextInternal
  0.93%  libjvm.so           [.] ClearNoncleanCardWrapper::do_MemRegion(MemRegion)
  0.91%  perf-13101.map      [.] Ljava/nio/HeapByteBuffer;.slice in Lorg/apache/hadoop/hbase/io/hfile/HFileReaderV2$ScannerV2;.updateCurrBlock

_next is all inlined but its bulk is readKeyValueLen which is reading key and value lengths,
tags and mvcc.

Bunch of UnsafeComparer#compareTo. Probably hard to do anything about these with current formats.

I ain't sure what 0x00007f1fed96e4ec is. Symbol not coming through for it.

> Lift the scan ceiling
> ---------------------
>                 Key: HBASE-13291
>                 URL: https://issues.apache.org/jira/browse/HBASE-13291
>             Project: HBase
>          Issue Type: Improvement
>          Components: Scanners
>    Affects Versions: 1.0.0
>            Reporter: stack
>            Assignee: stack
>         Attachments: 13291.hacks.txt, 13291.inlining.txt, Screen Shot 2015-03-26 at 12.12.13
PM.png, Screen Shot 2015-03-26 at 3.39.33 PM.png, hack_to_bypass_bb.txt, nonBBposAndInineMvccVint.txt,
q (1).png, traces.7.svg, traces.filterall.svg, traces.nofilter.svg, traces.small2.svg, traces.smaller.svg
> Scanning medium sized rows with multiple concurrent scanners exhibits interesting 'ceiling'
properties. A server runs at about 6.7k ops a second using 450% of a possible 1600% of CPU
when 4 clients, each with 10 threads, are doing scans of 1000 rows. If I add the '--filterAll'
argument (do not return results), then we run at 1450% of a possible 1600% but we do 8k ops
a second.
> Let me attach flame graphs for the two cases. Unfortunately, there is some frustrating dark
art going on. Let me try to figure it out... Filing issue in meantime to keep score in.

This message was sent by Atlassian JIRA
