commons-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: [math] Re: running average of a rate
Date Tue, 15 Mar 2011 00:37:01 GMT
This approach is fine for relatively well-behaved distributions.  Anything
more skewed than, say, an exponential, or as long-tailed as a t(3)
distribution, is likely to give this approach trouble.

See
http://search-lucene.com/jd/mahout/math/org/apache/mahout/math/stats/OnlineSummarizer.html
for the alternative I have been suggesting.  It can keep accurate estimates
of any quantile that you like.

On Mon, Mar 14, 2011 at 5:17 PM, sebb <sebbaz@gmail.com> wrote:

>
> In JMeter we needed to display long running percentiles without using
> excess memory, and someone came up with the idea of using buckets for
> ranges of values. So instead of keeping details on each sample elapsed
> time, we increment the count for the appropriate bucket.
>
> If the range of values is too large to use a single bucket for each
> value, each bucket can represent a range of values.
> These ranges can potentially be non-uniform though that does
> complicate the calculations.
>
> JMeter actually uses a TreeMap for the values and counts - the values
> need to be sorted in order to calculate percentiles.
>
> Depending on the data-set, it might be possible to use fixed arrays
> instead of the TreeMap.
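
For readers who want to see the bucket idea from the quoted message in
concrete form, here is a minimal sketch.  It is not JMeter's actual code:
the class name BucketedPercentiles, the uniform bucketWidth parameter, and
the choice to report the lower bound of the bucket holding the requested
rank are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: counts samples per bucket instead of storing each sample. */
class BucketedPercentiles {
    private final long bucketWidth;                 // assumed uniform bucket width
    private final TreeMap<Long, Long> counts = new TreeMap<>();
    private long total = 0;

    BucketedPercentiles(long bucketWidth) {
        if (bucketWidth <= 0) {
            throw new IllegalArgumentException("bucketWidth must be positive");
        }
        this.bucketWidth = bucketWidth;
    }

    /** Increments the count of the bucket that contains the value. */
    void add(long value) {
        // Key each bucket by its lower bound; floorDiv keeps negatives consistent.
        long bucket = Math.floorDiv(value, bucketWidth) * bucketWidth;
        counts.merge(bucket, 1L, Long::sum);
        total++;
    }

    /** Returns the lower bound of the bucket holding the p-th percentile (0 < p <= 100). */
    long percentile(double p) {
        if (total == 0) {
            throw new IllegalStateException("no samples recorded");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        // Rank of the sample we are looking for, counting from 1.
        long target = (long) Math.ceil(p * total / 100.0);
        long seen = 0;
        // TreeMap iterates in ascending key order, so counts accumulate in sorted order.
        for (Map.Entry<Long, Long> e : counts.entrySet()) {
            seen += e.getValue();
            if (seen >= target) {
                return e.getKey();
            }
        }
        return counts.lastKey();
    }
}
```

Memory grows with the number of distinct buckets rather than the number
of samples, and the answer is only as precise as the bucket width.  That
is the trade-off behind the caveat about skewed or long-tailed data at
the top of this message: a width that suits the bulk of the data can
leave a long tail either poorly resolved or spread over many sparse
buckets.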
