From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HBASE-15594) [YCSB] Improvements
Date Thu, 19 May 2016 16:53:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15291495#comment-15291495 ]

stack edited comment on HBASE-15594 at 5/19/16 4:52 PM:
--------------------------------------------------------

Random read: here is where CPU is being spent (perf top). We have some work to do:
{code}
  7.25%  perf-42125.map      [.] Lorg/apache/hadoop/hbase/io/hfile/HFileReaderV3$ScannerV3;.blockSeek
  6.17%  perf-42125.map      [.] Lorg/apache/hadoop/hbase/io/hfile/bucket/BucketCache;.getBlock
  6.14%  perf-42125.map      [.] Lorg/apache/hadoop/hbase/util/Counter;.add
  4.40%  perf-42125.map      [.] jshort_disjoint_arraycopy
  4.06%  libjvm.so           [.] TypeArrayKlass::allocate_common(int, bool, Thread*)
  2.93%  libjvm.so           [.] SpinPause
  2.43%  perf-42125.map      [.] jint_disjoint_arraycopy
  2.34%  perf-42125.map      [.] jlong_disjoint_arraycopy
  1.38%  perf-42125.map      [.] jbyte_disjoint_arraycopy
  1.35%  perf-42125.map      [.] Lorg/apache/hadoop/hbase/io/hfile/HFileBlockIndex$BlockIndexReader;.binarySearchNonRootIndex
  1.25%  libjvm.so           [.] ParallelTaskTerminator::offer_termination(TerminatorTerminator*)
  1.01%  perf-42125.map      [.] Lorg/apache/hadoop/hbase/util/CompoundBloomFilter;.contains
{code}
Little by way of compares in this list since this is random read. The blockSeek is
interesting, as is Counter#add; need to work on those. The getBlock time looks to be the
purge of all locks from the weak hash map taking a bunch of time; it shows up as a point
of contention too. TODO after figuring out how to hit 100% CPU.
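On the Counter#add hotspot, one direction worth exploring (an illustration, not something decided here) is swapping the striped org.apache.hadoop.hbase.util.Counter for JDK 8's java.util.concurrent.atomic.LongAdder, which does the same per-thread cell striping with JDK-maintained contention handling. A minimal standalone sketch of the contention difference, assuming JDK 8+ (class name and counts are illustrative only):
{code}
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

// Compares a single shared AtomicLong against LongAdder, which stripes
// increments across per-thread cells much as HBase's Counter does.
public class CounterContention {
  public static void main(String[] args) throws InterruptedException {
    final int threads = 48;            // same as the 48-cpu test box above
    final long perThread = 10_000_000L;

    AtomicLong atomic = new AtomicLong();
    LongAdder adder = new LongAdder();

    System.out.printf("AtomicLong: %d ms%n",
        run(threads, perThread, atomic::incrementAndGet) / 1_000_000);
    System.out.printf("LongAdder:  %d ms%n",
        run(threads, perThread, adder::increment) / 1_000_000);
    System.out.printf("sums: %d / %d%n", atomic.get(), adder.sum());
  }

  // Starts `threads` threads, each doing `n` increments; returns elapsed nanos.
  static long run(int threads, long n, Runnable inc) throws InterruptedException {
    Thread[] ts = new Thread[threads];
    long start = System.nanoTime();
    for (int i = 0; i < threads; i++) {
      ts[i] = new Thread(() -> { for (long j = 0; j < n; j++) inc.run(); });
      ts[i].start();
    }
    for (Thread t : ts) t.join();
    return System.nanoTime() - start;
  }
}
{code}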



> [YCSB] Improvements
> -------------------
>
>                 Key: HBASE-15594
>                 URL: https://issues.apache.org/jira/browse/HBASE-15594
>             Project: HBase
>          Issue Type: Umbrella
>            Reporter: stack
>            Priority: Critical
>
> Running YCSB and getting good results is an arcane art. For example, in my testing, a
> few handlers (100), as many readers as I had CPUs (48), and upping client connections to
> the same as the CPU count made for 2-3x the throughput. These config changes came of lore;
> which configurations need tweaking is not obvious going by their names, the app gave no
> indication of where or why we were blocked or of which metrics matter, and none of this
> was written down in the docs.
> Even still, I am stuck trying to make use of all of the machine. I am unable to overrun
> a server even with 8 client nodes beating up a single node (workloadc, all random-read,
> with no data returned: -p readallfields=false). There is also a strange phenomenon where,
> if I add a few machines, rather than 3x the YCSB throughput with 3 nodes in the cluster,
> each machine instead does about 1/3rd of it.
> This umbrella issue is to host items that improve our defaults and to note how to get
> good numbers running YCSB. In particular, I want to be able to saturate a machine.
> Here are the configs I'm currently working with. I've not done the work to figure out
> whether the client-side ones are optimal (it is weird how big a difference client-side
> changes can make -- need to fix this). On my 48-cpu machine, I can do about 370k random
> reads a second from data totally cached in the bucketcache. If I short-circuit the user
> Gets so they do no work and return immediately, I can do 600k ops a second, but the CPUs
> are only at 60-70%. I cannot get them to go above this. Working on it.
> {code}
> <property>
>   <name>hbase.ipc.server.read.threadpool.size</name>
>   <value>48</value>
> </property>
> <property>
>   <name>hbase.regionserver.handler.count</name>
>   <value>100</value>
> </property>
> <property>
>   <name>hbase.client.ipc.pool.size</name>
>   <value>100</value>
> </property>
> <property>
>   <name>hbase.htable.threads.max</name>
>   <value>48</value>
> </property>
> {code}
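As an aside, the two client-side properties above can also be set programmatically on the load client's Configuration instead of through hbase-site.xml; a minimal sketch, assuming the HBase 1.x client API (values copied from the config above, not verified as optimal):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

// Applies the client-side settings quoted above in code rather than XML.
public class ClientTuning {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Number of pooled IPC connections kept per region server.
    conf.setInt("hbase.client.ipc.pool.size", 100);
    // Max threads in the HTable client-side thread pool.
    conf.setInt("hbase.htable.threads.max", 48);
    try (Connection connection = ConnectionFactory.createConnection(conf)) {
      // Hand `connection` to the load-generating client here.
    }
  }
}
{code}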



