cassandra-commits mailing list archives

From "Benedict (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-6486) Latency Measurement
Date Sun, 15 Dec 2013 02:43:07 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-6486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848518#comment-13848518 ]

Benedict commented on CASSANDRA-6486:
-------------------------------------

Sure. I'm not actually working on it, just considering options, and only intended to attack
this on the side. This can actually all be done in codahale with a custom reservoir, and now
that I think I've got a handle on what I'll do, the implementation of that reservoir should
actually be very easy. So no doubt it will start bugging me soon and I'll implement it in
my down time sometime before the new year.
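
For reference, the plumbing involved is small: a codahale reservoir only has to implement the
three-method Reservoir interface. A minimal uniform-sampling skeleton (sizes and names here are
illustrative, and as written it is a single-writer sketch; codahale's own UniformReservoir uses
an AtomicLongArray) would look something like:

{code:java}
import com.codahale.metrics.Reservoir;
import com.codahale.metrics.Snapshot;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

// Skeleton only: uniform sampling via Vitter's algorithm R, to show where a
// replacement sampling strategy would plug in. Not the proposed design.
public class CustomReservoir implements Reservoir {
    private final AtomicLong count = new AtomicLong();
    private final long[] values = new long[1028]; // codahale's default sample size

    @Override
    public int size() {
        return (int) Math.min(count.get(), values.length);
    }

    @Override
    public void update(long value) {
        long n = count.incrementAndGet();
        if (n <= values.length) {
            values[(int) n - 1] = value;                      // still filling
        } else {
            // replace a random slot with probability size/n => uniform sample
            long i = ThreadLocalRandom.current().nextLong(n);
            if (i < values.length) values[(int) i] = value;
        }
    }

    @Override
    public Snapshot getSnapshot() {
        int s = size();
        List<Long> copy = new ArrayList<Long>(s);
        for (int i = 0; i < s; i++) copy.add(values[i]);
        return new Snapshot(copy); // Metrics 3.0-era API; later versions use UniformSnapshot
    }
}
{code}

It then drops straight in via new Timer(new CustomReservoir()) or new Histogram(new
CustomReservoir()), so none of the surrounding metric plumbing needs to change.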

I do think our current use of codahale is very bad at reporting latency spikes, which is more
of a problem for our project than for others, given how we advertise real-time characteristics.

> Latency Measurement
> -------------------
>
>                 Key: CASSANDRA-6486
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6486
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Benedict
>            Priority: Minor
>
> Latency measurement in Cassandra is currently suboptimal. Exactly what the latency measurements
tell you isn't intuitively clear, due to their exponentially decaying weighting, but they amount
to some view of the latency per (unweighted) operation over approximately the past 10 minutes,
with greater weight given to more recent operations. This has some obvious flaws, the most
notable being that, due to probabilistic sampling, large outlier events (e.g. GC) can easily
be lost over a multi-minute time horizon, and even when caught they are unlikely to appear even
in the 99.9th percentile, since numerically they account for a tiny fraction of events.
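> To make that concrete (the op rate below is an assumption for illustration, not a measurement):
with codahale's default sample size of 1028, a single GC-length outlier in a busy 10-minute
window is very unlikely to still be in even a uniform sample, and decay only makes this worse:
{code:java}
// Back-of-envelope only; the op rate is assumed, not measured.
public class OutlierRetention {
    public static void main(String[] args) {
        int reservoirSize = 1028;            // codahale's default sample size
        long opsPerSecond = 10_000;          // assumed load
        long windowSeconds = 600;            // ~10 minute horizon
        long totalEvents = opsPerSecond * windowSeconds; // 6,000,000
        double pRetained = (double) reservoirSize / totalEvents;
        System.out.printf("P(outlier retained) = %.4f%%%n", pRetained * 100);
        // => ~0.0171%; and even a retained outlier is just 1 of 1028 samples,
        // right at the edge of what the 99.9th percentile can resolve.
    }
}
{code}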
> I'm generally thinking about how we might improve on this, and want to dump my ideas
here for discussion. I think the following things should be targeted:
> 1) Ability to see uniform latency measurements for different time horizons stretching
back from the present, e.g. last 1s, 1m, 1hr and 1day
> 2) Ability to bound the error margin of statistics for all of these intervals
> 3) Protect against losing outlier measurements
> 4) Possibly offer the ability to weight statistics, so that longer latencies are not
underplayed even if they are counted
> 5) Preferably non-blocking, memory efficient, and relatively garbage-free
> (3) and (4) are the trickiest, as a theoretically sound and general approach isn't immediately
obvious. There are a number of possibilities that spring to mind:
> 1) ensure that we have enough sample points that we are probabilistically guaranteed
not to lose them, but over large time horizons this is problematic due to memory constraints,
and it doesn't address (4);
> 2) count large events multiple times (or sub-slices of the events), based on e.g. average
op-rate. I am not a fan of this idea because it makes possibly bad assumptions about behaviour
and doesn't seem very theoretically sound;
> 3) weight the probability of retaining an event by its length (see the sketch after this
list). The problem with this approach is that it ties you into (4) without offering the current
view of statistics (i.e. unweighted operations), and it also doesn't lend itself to efficient
implementation.
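> To make (3) concrete, the standard technique would be something like Efraimidis/Spirakis
weighted reservoir sampling (my example, nothing settled): each event of weight w (its latency)
gets key u^(1/w), and we keep the k largest keys, so longer events are proportionally more
likely to survive. The per-update heap operation and allocation also show why it doesn't lend
itself to efficient implementation:
{code:java}
import java.util.PriorityQueue;
import java.util.concurrent.ThreadLocalRandom;

// Efraimidis-Spirakis A-Res weighted reservoir sampling, weight = latency.
public class WeightedSample {
    private final int capacity;
    private final PriorityQueue<double[]> heap; // [key, latency], smallest key at head

    public WeightedSample(int capacity) {
        this.capacity = capacity;
        this.heap = new PriorityQueue<>(capacity,
                (a, b) -> Double.compare(a[0], b[0]));
    }

    public void update(long latency) {
        double u = ThreadLocalRandom.current().nextDouble();
        double key = Math.pow(u, 1.0 / Math.max(1, latency)); // longer => larger key
        if (heap.size() < capacity) {
            heap.add(new double[] { key, latency });
        } else if (key > heap.peek()[0]) {
            heap.poll();                                      // evict smallest key
            heap.add(new double[] { key, latency });
        }
    }
}
{code}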
> I'm currently leaning towards a fourth approach, which attempts to hybridise uniform
sampling and histogram behaviour by separating the sample space into ranges, each some multiple
of the size of the last (say, 2x). Each range has a uniform sample of the events that occurred
in that range, plus a count of total events. Ideally the size of the sample will vary with the
number of events occurring in each range, but with a calculated lower bound to ensure we do
not lose events.
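> Roughly the shape I have in mind (all names and sizes below are placeholders, and the update
path is written single-writer; per goal (5) the real structure would be sharded per thread):
{code:java}
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

// One bucket per power-of-two latency range [2^i, 2^(i+1)): an exact count plus
// a uniform sample of the events that fell in that range, so outliers land in
// their own sparsely populated buckets instead of competing with the bulk.
public class RangeSampledHistogram {
    private static final int MIN_SAMPLE = 64;      // placeholder lower bound
    private final Bucket[] buckets = new Bucket[64];

    public RangeSampledHistogram() {
        for (int i = 0; i < buckets.length; i++) buckets[i] = new Bucket();
    }

    public void update(long latencyNanos) {
        int range = 63 - Long.numberOfLeadingZeros(Math.max(1, latencyNanos));
        buckets[range].update(latencyNanos);
    }

    private static final class Bucket {
        final AtomicLong count = new AtomicLong(); // total events in this range
        final long[] sample = new long[MIN_SAMPLE];

        void update(long value) {
            long n = count.incrementAndGet();
            if (n <= sample.length) {
                sample[(int) n - 1] = value;
            } else {
                // algorithm R: uniform sample of everything seen in this range
                long i = ThreadLocalRandom.current().nextLong(n);
                if (i < sample.length) sample[(int) i] = value;
            }
        }
    }
}
{code}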
> This approach lends itself to all 5 goals above:
> 1) by maintaining the same structure for each time horizon, and uniformly sampling from
all of the directly lower order time horizons to maintain it;
> 2) by imposing minimum sample sizes for each range;
> 3) ditto (2);
> 4) by producing time/frequency-weighted statistics using the samples and counts from
each range;
> 5) with thread-local array-based timers that are synchronised with the global timer once
every minimum period, by the owning thread (sketched below)
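> For (5), the recording path could look like the following (buffer size and flush policy are
placeholders, and RangeSampledHistogram is the hypothetical structure sketched above):
{code:java}
// Each thread records into its own array; only the owning thread merges into
// the shared structure, keeping the hot path non-blocking and allocation-free.
public class ThreadLocalRecorder {
    private final RangeSampledHistogram global;
    private final long flushIntervalNanos;
    private final ThreadLocal<Local> local = new ThreadLocal<Local>() {
        @Override protected Local initialValue() { return new Local(); }
    };

    public ThreadLocalRecorder(RangeSampledHistogram global, long flushIntervalNanos) {
        this.global = global;
        this.flushIntervalNanos = flushIntervalNanos;
    }

    public void record(long latencyNanos) {
        Local l = local.get();
        l.buffer[l.pos++] = latencyNanos;
        long now = System.nanoTime();
        if (l.pos == l.buffer.length || now - l.lastFlush >= flushIntervalNanos) {
            for (int i = 0; i < l.pos; i++) global.update(l.buffer[i]);
            l.pos = 0;
            l.lastFlush = now;
        }
    }

    private static final class Local {
        final long[] buffer = new long[256]; // placeholder size
        int pos;
        long lastFlush = System.nanoTime();
    }
}
{code}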
> This also builds reasonably nicely on the timers I have already written for CASSANDRA-6199,
so some of the work is already done.
> Thoughts / discussion would be welcome, especially if you think I've missed another obvious
approach.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
