cassandra-commits mailing list archives

From "Chris Lohfink (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Deleted] (CASSANDRA-7247) Provide top ten most frequent keys per column family
Date Tue, 20 May 2014 22:30:41 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Lohfink updated CASSANDRA-7247:
-------------------------------------

    Comment: was deleted

(was: The problem is that StreamSummary is not thread safe.  There is a ConcurrentStreamSummary,
which I found in this implementation to be ~4x slower than a synchronized block around the offer
of the non-thread-safe one.  ConcurrentStreamSummary performed similarly when also wrapped in a
synchronized block, as shown below, but because it loses any benefit of being a concurrent
implementation once access is serialized, I think the faster implementation is best.

Run on a 2013 Retina MBP with a 500 GB SSD against trunk:

{code:title=No Changes}
            id, ops       ,    op/s,   key/s,    mean,     med,     .95,     .99,    .999,     max,   time,   stderr
 4 threadCount, 634450    ,   21692,   21692,     0.2,     0.2,     0.2,     0.2,     0.4,   740.1,   29.2,  0.01188
 8 threadCount, 886600    ,   29762,   29762,     0.3,     0.2,     0.3,     0.4,     1.3,  1007.3,   29.8,  0.01220
16 threadCount, 912050    ,   29035,   29035,     0.5,     0.3,     0.9,     2.5,    11.2,  1393.8,   31.4,  0.01162
24 threadCount, 1022250   ,   32681,   32681,     0.7,     0.5,     1.0,     2.9,    13.5,  1126.5,   31.3,  0.00923
36 threadCount, 946550    ,   30900,   30900,     1.2,     0.8,     1.4,     3.0,    22.5,  1369.2,   30.6,  0.01089
{code}

{code:title=With Patch}
            id, ops       ,    op/s,   key/s,    mean,     med,     .95,     .99,    .999,     max,   time,   stderr
 4 threadCount, 643900    ,   21700,   21700,     0.2,     0.2,     0.2,     0.2,     0.9,   941.1,   29.7,  0.01079
 8 threadCount, 942100    ,   32300,   32300,     0.2,     0.2,     0.3,     0.3,     1.2,   849.5,   29.2,  0.01519
16 threadCount, 907400    ,   30650,   30650,     0.5,     0.3,     0.8,     1.9,    10.7,  1124.0,   29.6,  0.01112
24 threadCount, 1026150   ,   31753,   31753,     0.7,     0.5,     0.9,     3.3,    20.6,  1299.0,   32.3,  0.01295
36 threadCount, 980600    ,   30077,   30077,     1.2,     0.8,     1.3,     2.7,    24.9,  1394.3,   32.6,  0.01747
{code}

{code:title=ConcurrentStreamSummary with sync}
 4 threadCount, 494350    ,   16643,   16643,     0.2,     0.2,     0.3,     0.3,     1.0,   943.6,   29.7,  0.01286
 8 threadCount, 812950    ,   26358,   26358,     0.3,     0.2,     0.3,     0.5,     1.4,  1488.9,   30.8,  0.01909
16 threadCount, 877500    ,   27396,   27396,     0.6,     0.3,     1.0,     2.2,    12.1,  1299.2,   32.0,  0.01824
24 threadCount, 837550    ,   25345,   25345,     0.9,     0.4,     1.2,     3.7,    84.2,  2123.6,   33.0,  0.02437
36 threadCount, 910200    ,   28008,   28008,     1.3,     0.6,     2.8,     9.2,    32.2,  1212.8,   32.5,  0.01654
{code}

{code:title=ConcurrentStreamSummary no blocking}
            id, ops       ,    op/s,   key/s,    mean,     med,     .95,     .99,    .999,     max,   time,   stderr
 4 threadCount, 183600    ,    6145,    6145,     0.6,     0.6,     0.8,     1.0,     2.6,   354.5,   29.9,  0.01063
 8 threadCount, 197200    ,    6593,    6593,     1.2,     1.1,     1.4,     1.8,     3.3,   413.5,   29.9,  0.00716
16 threadCount, 203200    ,    6794,    6794,     2.3,     2.2,     2.6,     3.5,    12.1,   649.1,   29.9,  0.01096
24 threadCount, 198000    ,    6615,    6615,     3.6,     3.3,     4.2,     4.9,    44.2,   570.4,   29.9,  0.00894
36 threadCount, 199800    ,    6627,    6627,     5.4,     4.9,     6.5,     8.0,   110.8,   272.3,   30.1,  0.01452
{code})

> Provide top ten most frequent keys per column family
> ----------------------------------------------------
>
>                 Key: CASSANDRA-7247
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7247
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Lohfink
>            Assignee: Chris Lohfink
>            Priority: Minor
>         Attachments: jconsole.png, patch.txt
>
>
> Since we already have the nice addthis stream library, we can use it to keep track of the most
frequent DecoratedKeys that come through the system using StreamSummaries ([nice explanation|http://boundary.com/blog/2013/05/14/approximate-heavy-hitters-the-spacesaving-algorithm/]).
 Then provide a new metric to access them via JMX.
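
For readers unfamiliar with the idea in the issue description, here is a minimal, self-contained sketch of Space-Saving style top-k counting. It is not the stream-lib StreamSummary and not the attached patch: the class name SpaceSavingTopK and its methods offer and topK are hypothetical, and the eviction step is a plain linear scan rather than the bucketed structure a production implementation would likely use. Access is serialized with synchronized, in the spirit of the approach the deleted comment favored.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not Cassandra or stream-lib code. */
final class SpaceSavingTopK<T> {
    private final int capacity;                          // max number of tracked keys
    private final Map<T, Long> counts = new HashMap<>(); // key -> approximate count

    SpaceSavingTopK(int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.capacity = capacity;
    }

    /** Records one occurrence of item; callers are serialized on this object. */
    synchronized void offer(T item) {
        Long current = counts.get(item);
        if (current != null) {
            counts.put(item, current + 1);
        } else if (counts.size() < capacity) {
            counts.put(item, 1L);
        } else {
            // Table is full: evict the key with the smallest count and let the
            // newcomer inherit that count plus one (so counts may overestimate).
            T minKey = null;
            long min = Long.MAX_VALUE;
            for (Map.Entry<T, Long> e : counts.entrySet()) {
                if (e.getValue() < min) {
                    min = e.getValue();
                    minKey = e.getKey();
                }
            }
            counts.remove(minKey);
            counts.put(item, min + 1);
        }
    }

    /** Returns a snapshot of up to k tracked keys, highest count first. */
    synchronized List<Map.Entry<T, Long>> topK(int k) {
        List<Map.Entry<T, Long>> entries = new ArrayList<>();
        for (Map.Entry<T, Long> e : counts.entrySet()) {
            entries.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        entries.sort(Map.Entry.<T, Long>comparingByValue().reversed());
        return new ArrayList<>(entries.subList(0, Math.min(k, entries.size())));
    }
}
```

A caller would create one instance per column family with a capacity somewhat larger than ten, call offer on each key seen, and expose topK(10) through whatever metric interface is in use. Because evicted keys hand their count to the newcomer, reported counts are upper bounds rather than exact values.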



--
This message was sent by Atlassian JIRA
(v6.2#6252)
