aurora-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From David McLaughlin <da...@dmclaughlin.com>
Subject Re: Review Request 53965: track percentile values for timed sliding stats
Date Tue, 22 Nov 2016 18:07:58 GMT

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/53965/#review156595
-----------------------------------------------------------



Thanks for the patch! I think percentiles are crucial for analyzing scheduling performance
and I'm pretty excited about having them for our latencies. However, I'm not sure if it makes
sense to patch our local fork of twitter.commons. It might be better to either:

a) Upstream these changes to twitter.commons (where we might get better feedback on the approach
here) where the tests for the code actually runs, and then copy them over into our fork 

OR

b) Move off twitter-commons stats and use something like util-stats (https://github.com/twitter/util/tree/develop/util-stats)
instead. 


The key advantage of (b) is that we'd be using a library that is very lightweight but has
both percentiles and can support multiple granularities (e.g. for per second metrics for graphview,
but minutely metrics for long term storage) and has a migration strategy (I've done this migration
internally at Twitter myself). 

I'm not sure what the implications of using (b) would be in terms of our dependency graph,
though. AFAIK we forked twitter-commons originally because of that.

- David McLaughlin


On Nov. 22, 2016, 4:35 p.m., Reza Motamedi wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/53965/
> -----------------------------------------------------------
> 
> (Updated Nov. 22, 2016, 4:35 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Joshua Cohen, Mehrdad Nurolahzade, and Santhosh
Kumar Shanmugham.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> The timing of many methods are sampled with @Timed annotation. Currently SlidingSatas
is being used to track total-msec-in-method-x and times-invoked-of-method-x and then compute
the mean-msec-in-method-x-per-invocation. However mean values are known not to represent distrubtions
with high skewness. 
> 
> This is request for review feedback for changes on computing stats, esp. timing stats
by adding percentile.
> 
> This patch introduces `SlidingTimedStats`, which uses `SlidingStats` to track the mean
mean-msec-in-method-x-per-invocation for events in the last K sampling windows and `Percentiles`
to track the p-values for msec-in-method-x for all invocations in the last K windows.
> 
> __changes overview__:
> - Introduced `SlidingTimedStats` to measure percentiles on all @Timed method calls
> - Comments added for a better way to calculate ratio based on two rates (num and denom)
in `SlidingStats`.
> - Tests added
> 
> _sample new stat:_
> attribute_store_fetch_one_10_0_percentile 0.0
> attribute_store_fetch_one_50_0_percentile 0.0
> attribute_store_fetch_one_90_0_percentile 0.0
> attribute_store_fetch_one_99_0_percentile 0.0
> attribute_store_fetch_one_99_9_percentile 0.0
> attribute_store_fetch_one_99_99_percentile 0.0
> 
> 
> Diffs
> -----
> 
>   commons/src/main/java/org/apache/aurora/common/inject/TimedInterceptor.java a4c4240b67cafb108ca587ad86d9fb9d9bb1553d

>   commons/src/main/java/org/apache/aurora/common/stats/SlidingStats.java f7a5ae41e307627fc55157758e9b7cdd861c3268

>   commons/src/main/java/org/apache/aurora/common/stats/SlidingTimedStats.java PRE-CREATION

>   commons/src/test/java/org/apache/aurora/common/stats/SlidingTimedStatsTest.java PRE-CREATION

> 
> Diff: https://reviews.apache.org/r/53965/diff/
> 
> 
> Testing
> -------
> 
> SlidingTimedStatsTest
> > passed
> 
> Seems like gradle is not set to run tests on `common`. Tests passed in IDEA.
> 
> 
> Thanks,
> 
> Reza Motamedi
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message