commons-issues mailing list archives

From "Christopher Nix (JIRA)" <>
Subject [jira] [Commented] (MATH-582) Percentile does not work as described in API
Date Wed, 08 Jun 2011 23:07:58 GMT


Christopher Nix commented on MATH-582:

I believe the implementation of percentiles within the library is in accordance with the NIST
definition of percentiles.  To address your examples separately:

1.  The API description of the implementation omits one rule: "If pos < 1 then return the
smallest element in the array".  In your first example pos = 25*(2+1)/100 = 0.75, so the
value of 0.0 returned is indeed correct for this implementation.

2.  In this definition of percentiles, pos is a position in the sorted array at which to
interpolate, with array indices starting at 1 rather than 0. So with pos = 1.25, the value
returned is correctly a quarter of the way between the 1st and 2nd array values (0 and 1),
i.e. 0.25.
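
To make the two points above concrete, here is a minimal sketch of the rule as I have described it. It is not the library's source: the class name PercentileSketch and the handling of the upper end (pos >= n returns the largest element) are my own assumptions for illustration.

```java
import java.util.Arrays;

// Illustrative sketch only; PercentileSketch is a hypothetical name,
// not a class in Commons Math.
final class PercentileSketch {

    // Estimates the p-th percentile using pos = p*(n+1)/100,
    // where pos is a 1-based position in the sorted array.
    static double percentile(double[] values, double p) {
        if (values.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need data and 0 < p <= 100");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;

        double pos = p * (n + 1) / 100.0;
        if (pos < 1) {
            return sorted[0];          // the rule missing from the API text
        }
        if (pos >= n) {
            return sorted[n - 1];      // assumed symmetric rule for the top end
        }
        int k = (int) Math.floor(pos); // 1-based index of the lower neighbour
        double d = pos - k;            // fractional part of pos
        double lower = sorted[k - 1];  // convert 1-based position to 0-based index
        double upper = sorted[k];
        return lower + d * (upper - lower);
    }

    public static void main(String[] args) {
        System.out.println(percentile(new double[]{0d, 1d}, 25));         // 0.0
        System.out.println(percentile(new double[]{0d, 1d, 1d, 1d}, 25)); // 0.25
    }
}
```

Both results match what StatUtils.percentile returned in your examples. The conversion from the 1-based position to a 0-based array index is the step that is easy to miss when reading the API description.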

Percentiles often behave counterintuitively on small datasets.  Other definitions,
for example one with pos = 1+p*(n-1)/100 (like in MS Excel), may meet your requirement better
in the above datasets, but not so well with medium ones.  With large datasets, the two definitions

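For comparison, here is a rough sketch of the alternative position formula mentioned above, pos = 1+p*(n-1)/100, applied to the same two datasets. The class name AltPercentileSketch is hypothetical, and I have not checked the output against any spreadsheet; it only evaluates the formula as written.

```java
import java.util.Arrays;

// Illustrative sketch only; AltPercentileSketch is a hypothetical name.
final class AltPercentileSketch {

    // Estimates the p-th percentile using pos = 1 + p*(n-1)/100,
    // where pos is a 1-based position in the sorted array.
    static double percentile(double[] values, double p) {
        if (values.length == 0 || p < 0 || p > 100) {
            throw new IllegalArgumentException("need data and 0 <= p <= 100");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        int n = sorted.length;

        double pos = 1 + p * (n - 1) / 100.0; // always within [1, n]
        if (pos >= n) {
            return sorted[n - 1];             // p = 100 lands on the last element
        }
        int k = (int) Math.floor(pos);        // 1-based index of the lower neighbour
        double d = pos - k;                   // fractional part of pos
        return sorted[k - 1] + d * (sorted[k] - sorted[k - 1]);
    }

    public static void main(String[] args) {
        System.out.println(percentile(new double[]{0d, 1d}, 25));         // 0.25
        System.out.println(percentile(new double[]{0d, 1d, 1d, 1d}, 25)); // 0.75
    }
}
```

With this formula pos never drops below 1, so the "return the smallest element" rule is never needed, which is why the first example yields 0.25 here instead of 0.0.
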
Hope this helps,

Chris N

> Percentile does not work as described in API
> --------------------------------------------
>                 Key: MATH-582
>                 URL:
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 2.2
>            Reporter: Andre Herbst
> example call:
> StatUtils.percentile(new double[]{0d, 1d}, 25)   returns 0.0
> The API says that there is a position being computed:  p*(n+1)/100 -> we have p=25
> and n=2
> I would expect position 0.75 as result. Next step according to the API is: interpolation
> between both values at floor(0.25) and at ceil(0.25). Those values are 0d and 1d ... so lower
> + d * (upper - lower) should give 0d + 0.25*(1d - 0d) = 0.25
> But the above call returns 0 as result. This does not make sense to me.
> another example where I think the result is not correct:
> StatUtils.percentile(new double[]{0d, 1d, 1d, 1d}, 25)   returns 0.25
> we have pos = 25*5/100 = 1.25  ... so d = 0.25
> values at position floor(1.25) and ceil(1.25) are 1d and 1d. How comes that the result
> is not between 1d?
