cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris Lohfink (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8341) Expose time spent in each thread pool
Date Thu, 20 Nov 2014 05:03:34 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219017#comment-14219017
] 

Chris Lohfink commented on CASSANDRA-8341:
------------------------------------------

as reference point, I did a [simplistic benchmark|https://gist.github.com/clohfink/9ee94e3767b5d8170220]
(on a 2014 MBP) with JMH of a few scenarios from patch v1:

{code}
Benchmark        Mode  Samples      Score       Error   Units
baseline         avgt       15   8300.492 ±    92.987   ns/op
staticWrap       avgt       15   8438.268 ±   132.177   ns/op
threadlocal      avgt       15   8464.201 ±   161.554   ns/op
wrapped          avgt       15   8424.011 ±   134.407   ns/op
baseline       sample   253749   8197.106 ±    16.157   ns/op
staticWrap     sample   251910   8233.368 ±    14.841   ns/op
threadlocal    sample   244075   8540.737 ±   106.365   ns/op
wrapped        sample   247083   8443.597 ±    99.892   ns/op
baseline           ss       15  49466.667 ±  8039.477      ns
staticWrap         ss       15  44466.667 ±  7697.095      ns
threadlocal        ss       15  52266.667 ± 11679.329      ns
wrapped            ss       15  60000.000 ± 24000.352      ns
{code}

As a v2 I removed the meter, and switched to using currentmiilis instead of nanotime (~1/3
speed).  I switched the normal ThreadPool one to just do everything in overriden execute.
 It wasn't as simple for the SEPExecutor.  Since I didn't want to break the separation between
the JMX SEP and the parent SEP by collecting metrics in the SEPWorker so I think keeping the
threadlocal approach with a before/after execute callback is a little cleaner (match the ThreadExecutor
api).  With [those changes|https://gist.github.com/clohfink/b51eb027c55008377d93]:

{code}
Benchmark        Mode  Samples       Score        Error   Units
baseline         avgt       15    8052.932 ±     92.517   ns/op
staticWrap       avgt       15    8313.957 ±    295.964   ns/op
threadlocal      avgt       15    8553.304 ±    189.656   ns/op
wrapped          avgt       15    8354.060 ±    169.359   ns/op
baseline       sample   252812    8209.038 ±     14.917   ns/op
staticWrap     sample   248648    8356.810 ±     15.974   ns/op
threadlocal    sample   247362    8388.620 ±     17.310   ns/op
wrapped        sample   249784    8338.419 ±     68.509   ns/op
baseline           ss       15   44933.333 ±   8305.190      ns
staticWrap         ss       15   50533.333 ±  13136.009      ns
threadlocal        ss       15  303600.000 ± 964458.694      ns
wrapped            ss       15   50066.667 ±   7300.839      ns
{code}

Problem with using ms precision is that a lot of the time it takes less then a ms to do a
task so it ends up being very below the actual value.  I was hesitant to use getCurrentThreadCpuTime
as a measure, but its a ns collection and much more meaningful then walltime.  Kinda ideal
so I [gave it a shot|https://gist.github.com/clohfink/50e22ab895a7a700661b] - its 1 little
more then 1us cost per task execution:

{code}
Benchmark        Mode  Samples      Score       Error   Units
baseline         avgt       15   8187.818 ±   162.270   ns/op
staticWrap       avgt       15   9585.818 ±   185.490   ns/op
threadlocal      avgt       15   9508.422 ±   166.103   ns/op
wrapped          avgt       15   9380.099 ±   142.239   ns/op
baseline       sample   251564   8259.852 ±    30.677   ns/op
staticWrap     sample   221622   9365.621 ±    19.831   ns/op
threadlocal    sample   218599   9521.815 ±    33.865   ns/op
wrapped        sample   213108   9743.400 ±    24.726   ns/op
baseline           ss       15  49333.333 ±  8663.130      ns
staticWrap         ss       15  45533.333 ±  9543.739      ns
threadlocal        ss       15  56933.333 ± 12742.318      ns
wrapped            ss       15  52800.000 ± 12001.537      ns
{code}

> Expose time spent in each thread pool
> -------------------------------------
>
>                 Key: CASSANDRA-8341
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8341
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Chris Lohfink
>            Priority: Minor
>              Labels: metrics
>         Attachments: 8341.patch, 8341v2.txt
>
>
> Can increment a counter with time spent in each queue.  This can provide context on how
much time is spent percentage wise in each stage.  Additionally can be used with littles law
in future if ever want to try to tune the size of the pools.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message