Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Thu, 20 Nov 2014 05:03:34 +0000 (UTC)
From: "Chris Lohfink (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12756446.1416414970000.541848.1416459814605@Atlassian.JIRA>
In-Reply-To: <JIRA.12756446.1416414970000@Atlassian.JIRA>
References: <JIRA.12756446.1416414970000@Atlassian.JIRA>
 <JIRA.12756446.1416414970387@arcas>
Subject: [jira] [Commented] (CASSANDRA-8341) Expose time spent in each
 thread pool
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


    [ https://issues.apache.org/jira/browse/CASSANDRA-8341?page=3Dcom.atlas=
sian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=3D=
14219017#comment-14219017 ]=20

Chris Lohfink commented on CASSANDRA-8341:
------------------------------------------

as reference point, I did a [simplistic benchmark|https://gist.github.com/c=
lohfink/9ee94e3767b5d8170220] (on a 2014 MBP) with JMH of a few scenarios f=
rom patch v1:

{code}
Benchmark        Mode  Samples      Score       Error   Units
baseline         avgt       15   8300.492 =C2=B1    92.987   ns/op
staticWrap       avgt       15   8438.268 =C2=B1   132.177   ns/op
threadlocal      avgt       15   8464.201 =C2=B1   161.554   ns/op
wrapped          avgt       15   8424.011 =C2=B1   134.407   ns/op
baseline       sample   253749   8197.106 =C2=B1    16.157   ns/op
staticWrap     sample   251910   8233.368 =C2=B1    14.841   ns/op
threadlocal    sample   244075   8540.737 =C2=B1   106.365   ns/op
wrapped        sample   247083   8443.597 =C2=B1    99.892   ns/op
baseline           ss       15  49466.667 =C2=B1  8039.477      ns
staticWrap         ss       15  44466.667 =C2=B1  7697.095      ns
threadlocal        ss       15  52266.667 =C2=B1 11679.329      ns
wrapped            ss       15  60000.000 =C2=B1 24000.352      ns
{code}

As a v2 I removed the meter, and switched to using currentmiilis instead of=
 nanotime (~1/3 speed).  I switched the normal ThreadPool one to just do ev=
erything in overriden execute.  It wasn't as simple for the SEPExecutor.  S=
ince I didn't want to break the separation between the JMX SEP and the pare=
nt SEP by collecting metrics in the SEPWorker so I think keeping the thread=
local approach with a before/after execute callback is a little cleaner (ma=
tch the ThreadExecutor api).  With [those changes|https://gist.github.com/c=
lohfink/b51eb027c55008377d93]:

{code}
Benchmark        Mode  Samples       Score        Error   Units
baseline         avgt       15    8052.932 =C2=B1     92.517   ns/op
staticWrap       avgt       15    8313.957 =C2=B1    295.964   ns/op
threadlocal      avgt       15    8553.304 =C2=B1    189.656   ns/op
wrapped          avgt       15    8354.060 =C2=B1    169.359   ns/op
baseline       sample   252812    8209.038 =C2=B1     14.917   ns/op
staticWrap     sample   248648    8356.810 =C2=B1     15.974   ns/op
threadlocal    sample   247362    8388.620 =C2=B1     17.310   ns/op
wrapped        sample   249784    8338.419 =C2=B1     68.509   ns/op
baseline           ss       15   44933.333 =C2=B1   8305.190      ns
staticWrap         ss       15   50533.333 =C2=B1  13136.009      ns
threadlocal        ss       15  303600.000 =C2=B1 964458.694      ns
wrapped            ss       15   50066.667 =C2=B1   7300.839      ns
{code}

Problem with using ms precision is that a lot of the time it takes less the=
n a ms to do a task so it ends up being very below the actual value.  I was=
 hesitant to use getCurrentThreadCpuTime as a measure, but its a ns collect=
ion and much more meaningful then walltime.  Kinda ideal so I [gave it a sh=
ot|https://gist.github.com/clohfink/50e22ab895a7a700661b] - its 1 little mo=
re then 1us cost per task execution:

{code}
Benchmark        Mode  Samples      Score       Error   Units
baseline         avgt       15   8187.818 =C2=B1   162.270   ns/op
staticWrap       avgt       15   9585.818 =C2=B1   185.490   ns/op
threadlocal      avgt       15   9508.422 =C2=B1   166.103   ns/op
wrapped          avgt       15   9380.099 =C2=B1   142.239   ns/op
baseline       sample   251564   8259.852 =C2=B1    30.677   ns/op
staticWrap     sample   221622   9365.621 =C2=B1    19.831   ns/op
threadlocal    sample   218599   9521.815 =C2=B1    33.865   ns/op
wrapped        sample   213108   9743.400 =C2=B1    24.726   ns/op
baseline           ss       15  49333.333 =C2=B1  8663.130      ns
staticWrap         ss       15  45533.333 =C2=B1  9543.739      ns
threadlocal        ss       15  56933.333 =C2=B1 12742.318      ns
wrapped            ss       15  52800.000 =C2=B1 12001.537      ns
{code}

> Expose time spent in each thread pool
> -------------------------------------
>
>                 Key: CASSANDRA-8341
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8341
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Chris Lohfink
>            Priority: Minor
>              Labels: metrics
>         Attachments: 8341.patch, 8341v2.txt
>
>
> Can increment a counter with time spent in each queue.  This can provide =
context on how much time is spent percentage wise in each stage.  Additiona=
lly can be used with littles law in future if ever want to try to tune the =
size of the pools.


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)