cassandra-commits mailing list archives

From "Chris Lohfink (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7247) Provide top ten most frequent keys per column family
Date Sun, 21 Sep 2014 05:35:34 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14142343#comment-14142343 ]

Chris Lohfink commented on CASSANDRA-7247:
------------------------------------------

Updated to always be on, but I think 2 or 3 are equally viable. It's still using an executor
to single-thread the updates, which allows a more performant StreamSummary and provides a 1k
backlog cap, especially since I'm not sure about the performance impact of now using the
AbstractType. Instead of using DecoratedKey.toString, I changed it to use the human-readable
format from the partition's type, which makes it more useful for debugging. If we keep this as
an always-on option, I can add a nodetool command to list the keys in a nice format.
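
For readers following along, here is a minimal sketch of the "single-threaded executor with a backlog cap" idea described above. It is not taken from the attached patch: the class name BoundedSampler, its methods, and the choice to drop samples when the backlog is full are all assumptions made for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

// Hypothetical helper: funnels samples from many threads onto one worker thread
// so the sink (for example a frequency summary) needs no locking of its own.
final class BoundedSampler<T> {
    private final ThreadPoolExecutor executor;
    private final Consumer<T> sink;
    private final AtomicLong dropped = new AtomicLong();

    BoundedSampler(int backlogCap, Consumer<T> sink) {
        this.sink = sink;
        // One worker thread and a bounded queue; AbortPolicy makes execute()
        // throw once the queue is full instead of blocking the caller.
        this.executor = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(backlogCap),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Returns true if the sample was queued, false if it was dropped
    // because the backlog cap was reached (or the sampler was shut down).
    boolean offer(T key) {
        try {
            executor.execute(() -> sink.accept(key));
            return true;
        } catch (RejectedExecutionException e) {
            dropped.incrementAndGet();
            return false;
        }
    }

    long dropped() {
        return dropped.get();
    }

    // Stops accepting samples and waits for queued ones to be processed.
    boolean shutdownAndWait(long timeoutMillis) {
        executor.shutdown();
        try {
            return executor.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Dropping samples under load keeps the hot path from blocking, at the cost of slightly less accurate counts; whether the patch drops or handles overflow differently is not stated in this comment.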

> Provide top ten most frequent keys per column family
> ----------------------------------------------------
>
>                 Key: CASSANDRA-7247
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7247
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Chris Lohfink
>            Assignee: Chris Lohfink
>            Priority: Minor
>         Attachments: cassandra-2.1-7247.txt, jconsole.png, patch.txt
>
>
> Since we already have the nice addthis stream library, we can use it to keep track of the most
> frequent DecoratedKeys that come through the system using StreamSummaries ([nice explanation|http://boundary.com/blog/2013/05/14/approximate-heavy-hitters-the-spacesaving-algorithm/]).
> Then provide a new metric to access them via JMX.
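
As a rough illustration of the Space-Saving idea referenced in the description, the sketch below keeps a fixed number of counters and, when a new key arrives with no free counter, replaces the key with the smallest count and inherits that count. This is a simplified stand-in written for this note, not the stream library's StreamSummary; the class SpaceSavingSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified Space-Saving counter set (illustrative only, not thread-safe).
final class SpaceSavingSketch<T> {
    private final int capacity;
    private final Map<T, Long> counts = new HashMap<>();

    SpaceSavingSketch(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    void offer(T item) {
        Long current = counts.get(item);
        if (current != null) {
            counts.put(item, current + 1);      // already monitored: increment
        } else if (counts.size() < capacity) {
            counts.put(item, 1L);               // free counter available
        } else {
            // No free counter: evict the key with the smallest count and let
            // the new key inherit that count plus one (may overestimate).
            T minKey = null;
            long minCount = Long.MAX_VALUE;
            for (Map.Entry<T, Long> e : counts.entrySet()) {
                if (e.getValue() < minCount) {
                    minKey = e.getKey();
                    minCount = e.getValue();
                }
            }
            counts.remove(minKey);
            counts.put(item, minCount + 1);
        }
    }

    // Estimated count for a key; 0 if the key is not currently monitored.
    long estimate(T item) {
        return counts.getOrDefault(item, 0L);
    }

    // Up to k monitored keys, highest estimated count first.
    List<T> top(int k) {
        List<T> keys = new ArrayList<>(counts.keySet());
        keys.sort((x, y) -> Long.compare(counts.get(y), counts.get(x)));
        return new ArrayList<>(keys.subList(0, Math.min(k, keys.size())));
    }
}
```

The linear scan for the minimum keeps the sketch short; a production implementation would keep counters ordered so eviction does not cost O(capacity) per miss.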



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
