cassandra-commits mailing list archives

From "Ariel Weisberg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11752) histograms/metrics in 2.2 do not appear recency biased
Date Thu, 12 May 2016 13:48:13 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15281528#comment-15281528 ]

Ariel Weisberg commented on CASSANDRA-11752:
--------------------------------------------

Maybe the server should provide the raw histogram, but for the percentiles provide a value
that is for a recent window of time. IOW do the work of munging the histogram for the monitoring
programs instead of forcing them to provide an integration for munging EstimatedHistogram.
This would also make it less of a change in behavior for people who are upgrading.
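
As a rough illustration of the idea above (not Cassandra's actual API), a percentile
for a recent window can be derived from two snapshots of a cumulative bucket histogram
by subtracting one from the other. The class name, method names, and bucket bounds
below are hypothetical; EstimatedHistogram's real bucket offsets and accessors differ.

```java
import java.util.Arrays;

public class WindowedPercentile {
    // Hypothetical bucket upper bounds in microseconds (an assumption for this
    // sketch; these are NOT EstimatedHistogram's real bucket offsets).
    static final long[] BUCKET_UPPER_BOUNDS = {100, 200, 500, 1000, 5000, 10000};

    /** Per-bucket counts recorded between two cumulative snapshots. */
    static long[] delta(long[] previous, long[] current) {
        if (previous.length != current.length) {
            throw new IllegalArgumentException("snapshots must have the same bucket count");
        }
        long[] out = new long[current.length];
        for (int i = 0; i < current.length; i++) {
            // Clamp at zero so a counter reset between snapshots cannot go negative.
            out[i] = Math.max(0L, current[i] - previous[i]);
        }
        return out;
    }

    /** Upper bound of the bucket holding the p-th quantile (0 < p <= 1), or 0 if empty. */
    static long percentile(long[] counts, long[] upperBounds, double p) {
        long total = Arrays.stream(counts).sum();
        if (total == 0) {
            return 0L;
        }
        long rank = Math.max(1L, (long) Math.ceil(p * total));
        long seen = 0L;
        for (int i = 0; i < counts.length; i++) {
            seen += counts[i];
            if (seen >= rank) {
                return upperBounds[i];
            }
        }
        return upperBounds[upperBounds.length - 1];
    }

    public static void main(String[] args) {
        // Cumulative counts since server start, sampled at two polling intervals.
        long[] previous = {1_000_000, 1000, 100, 10, 0, 0};
        long[] current  = {1_000_050, 1010, 110, 60, 30, 0};

        long lifetimeP99 = percentile(current, BUCKET_UPPER_BOUNDS, 0.99);
        long windowP99 = percentile(delta(previous, current), BUCKET_UPPER_BOUNDS, 0.99);

        // The lifetime value is dominated by old fast requests and barely moves;
        // the windowed value reflects the slow requests in the latest interval.
        System.out.println("lifetime p99 bucket = " + lifetimeP99);
        System.out.println("windowed p99 bucket = " + windowP99);
    }
}
```

With these made-up numbers the lifetime p99 stays in the lowest bucket while the
windowed p99 lands in the 5000 us bucket, which is the kind of difference a monitoring
program would otherwise have to compute for itself.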

I think that a percentile that is based on a window of time going back to when the server
was started is an inappropriate metric for what JMX is/should be used for. Providing it has
zero value, so let's use that API for something more useful and closer to what existed before.

> histograms/metrics in 2.2 do not appear recency biased
> ------------------------------------------------------
>
>                 Key: CASSANDRA-11752
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11752
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Chris Burroughs
>              Labels: metrics
>         Attachments: boost-metrics.png, c-jconsole-comparison.png, c-metrics.png, default-histogram.png
>
>
> In addition to upgrading to metrics3, CASSANDRA-5657 switched to using a custom histogram
implementation.  After upgrading to Cassandra 2.2, histogram/timer metrics are now suspiciously
flat.  To be useful for graphing and alerting, metrics need to be biased towards recent events.
> I have attached images that I think illustrate this.
>  * The first two are a comparison between latency observed by a C* 2.2 (us) cluster showing
very flat lines and a client (using metrics 2.2.0, ms) showing server performance problems.
 We can't rule out with total certainty that something else is the cause (that's why we
measure from both the client & server), but they very rarely disagree.
>  * The 3rd image compares jconsole viewing of metrics on a 2.2 and 2.1 cluster over several
minutes.  Not a single digit changed on the 2.2 cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
