hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-13291) Lift the scan ceiling
Date Tue, 31 Mar 2015 00:04:02 GMT

     [ https://issues.apache.org/jira/browse/HBASE-13291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-13291:
--------------------------
    Attachment: 13291.hacks.txt

Adds methods on KV (KeyValue) so SQM (ScanQueryMatcher) can use them in SQM#match when the cell is a KV, avoiding a reparse of KV offsets and lengths.

Keeps the methods small so JIT compilation and inlining can happen.

The readKeyValueLen methods in HFileReaderV2 and V3 now maintain their own position; this is faster.
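
To illustrate the idea of a reader keeping its own position, here is a minimal sketch. It is not the code from the attached patch. The class and method names are hypothetical, and the cell layout (a 4-byte key length, a 4-byte value length, then the key and value bytes) is a simplification that leaves out tags and mvcc. It uses absolute ByteBuffer reads with a local int offset, so the buffer's own position is never touched.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch, not HBase code: walk cells with a caller-held offset.
final class LocalPositionSketch {
  /**
   * Reads the key and value lengths of the cell starting at pos using
   * absolute gets, stores them in lens[0] and lens[1], and returns the
   * offset of the next cell. The buffer's position is left unchanged.
   */
  static int nextCellOffset(ByteBuffer block, int pos, int[] lens) {
    int keyLen = block.getInt(pos);        // absolute read, no position update
    int valueLen = block.getInt(pos + 4);  // absolute read, no position update
    lens[0] = keyLen;
    lens[1] = valueLen;
    return pos + 8 + keyLen + valueLen;    // skip both length ints plus payload
  }
}
```

A caller would loop, feeding the returned offset back in as pos until it reaches the block limit.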

Inlined mvcc vint parse to try and speed it up.
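
For readers unfamiliar with the vint parse, here is a minimal sketch of decoding a variable-length long straight out of a byte array. It is not the code from the attached patch. It assumes the mvcc value uses the encoding written by Hadoop's WritableUtils#writeVLong; the class and method names are hypothetical.

```java
// Hypothetical sketch, not HBase code: decode a Hadoop-style vlong in place.
final class VintSketch {
  /** Decodes the vlong that starts at buf[offset]. */
  static long decode(byte[] buf, int offset) {
    byte first = buf[offset];
    if (first >= -112) {
      return first;                        // single-byte value, stored as-is
    }
    boolean negative = first < -120;       // first byte encodes sign and size
    int len = negative ? -119 - first : -111 - first; // total bytes incl. first
    long v = 0;
    for (int i = 1; i < len; i++) {
      v = (v << 8) | (buf[offset + i] & 0xFF);        // big-endian payload
    }
    return negative ? ~v : v;              // negatives are stored complemented
  }
}
```

With this encoding, the three bytes {-114, 0x01, 0x2C} decode to 300 and a single byte 5 decodes to 5.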

Here is what it looks like currently:

{code}
Samples: 40M of event 'cycles', Event count (approx.): 427157275282
 26.23%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/io/hfile/HFileReaderV2$ScannerV2;._next in Lorg/apache/hadoop/hbase/regionserver/StoreFileScanner;.next
 12.62%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/util/Bytes$LexicographicalComparerHolder$UnsafeComparer;.compareTo in Lorg/apache/hadoop/hbase/regionserver/ScanQueryMatcher;.moreMatch
 11.65%  perf-13101.map      [.] 0x00007f1fed96e4ec
  6.22%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/regionserver/StoreScanner;.optimize in Lorg/apache/hadoop/hbase/regionserver/StoreScanner;.next
  3.87%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/util/Bytes$LexicographicalComparerHolder$UnsafeComparer;.compareTo in Lorg/apache/hadoop/hbase/regionserver/ScanQueryMatcher;.match
  3.63%  perf-13101.map      [.] Ljava/util/PriorityQueue;.poll in Lorg/apache/hadoop/hbase/regionserver/KeyValueHeap;.pollRealKV
  2.92%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/util/Bytes$LexicographicalComparerHolder$UnsafeComparer;.compareTo in Lorg/apache/hadoop/hbase/regionserver/ScanQueryMatcher;.match
  2.83%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/regionserver/KeyValueHeap;.next in Lorg/apache/hadoop/hbase/regionserver/HRegion$RegionScannerImpl;.populateResult
  1.80%  libjvm.so           [.] BlockOffsetArrayNonContigSpace::block_start_unsafe(void const*) const
  1.63%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/regionserver/StoreScanner;.next
  1.56%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/regionserver/ScanQueryMatcher;.match
  1.53%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/util/Bytes$LexicographicalComparerHolder$UnsafeComparer;.compareTo in Lorg/apache/hadoop/hbase/regionserver/HRegion$RegionScannerImpl;.isStopRow
  1.53%  perf-13101.map      [.] Lorg/apache/hadoop/hbase/regionserver/KeyValueHeap;.peek in Lorg/apache/hadoop/hbase/regionserver/HRegion$RegionScannerImpl;.nextInternal
  0.93%  libjvm.so           [.] ClearNoncleanCardWrapper::do_MemRegion(MemRegion)
  0.91%  perf-13101.map      [.] Ljava/nio/HeapByteBuffer;.slice in Lorg/apache/hadoop/hbase/io/hfile/HFileReaderV2$ScannerV2;.updateCurrBlock
{code}

_next is fully inlined, but its bulk is readKeyValueLen, which reads the key and value lengths, tags, and mvcc.

There are a bunch of UnsafeComparer#compareTo calls. It is probably hard to do anything about these with the current formats.

I am not sure what 0x00007f1fed96e4ec is. The symbol is not coming through for it.


> Lift the scan ceiling
> ---------------------
>
>                 Key: HBASE-13291
>                 URL: https://issues.apache.org/jira/browse/HBASE-13291
>             Project: HBase
>          Issue Type: Improvement
>          Components: Scanners
>    Affects Versions: 1.0.0
>            Reporter: stack
>            Assignee: stack
>         Attachments: 13291.hacks.txt, 13291.inlining.txt, Screen Shot 2015-03-26 at 12.12.13 PM.png, Screen Shot 2015-03-26 at 3.39.33 PM.png, hack_to_bypass_bb.txt, nonBBposAndInineMvccVint.txt, q (1).png, traces.7.svg, traces.filterall.svg, traces.nofilter.svg, traces.small2.svg, traces.smaller.svg
>
>
> Scanning medium sized rows with multiple concurrent scanners exhibits interesting 'ceiling' properties. A server runs at about 6.7k ops a second using 450% of a possible 1600% of CPU when 4 clients, each with 10 threads, scan 1000 rows. If I add the '--filterAll' argument (do not return results), then we run at 1450% of a possible 1600% but we do 8k ops a second.
> Let me attach flame graphs for the two cases. Unfortunately, there is some frustrating dark art going on. Let me try to figure it out... Filing the issue in the meantime to keep score in.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
