commons-dev mailing list archives

From Gilles <gil...@harfang.homelinux.org>
Subject Re: [MATH-1120] Needed opinion about support on variations in percentile calculation
Date Wed, 21 May 2014 20:43:39 GMT
On Wed, 21 May 2014 13:16:26 -0700, Phil Steitz wrote:
> On 5/21/14, 12:18 PM, venkatesha murthy wrote:
>> Hi All,
>>
>> The existing Percentile class calculates the percentile based on the
>> quantile position of the array, fixed as
>> p * (N+1)/100 for the pth percentile of an array of size N. However, if
>> we were to enter these numbers in MS Excel
>> to calculate the percentile, it gives a different result that closely
>> resembles the formula [p*(N-1)/100]+1.
>>
>> It is imperative at times to match the computations to a standard
>> spreadsheet calculation or to a standard tool;
>
> What is "imperative" is that the implementation matches what the
> documentation says.  We do like to compare our results to other
> packages, though, and to explain differences where they exist.  You
> have basically done that above.
>> which is why I request that the quantile position be made
>> customizable.
>
> That is a reasonable request, as there are lots of different ways to
> compute quantiles.
>> In fact, even the kth selection used
>> could also be refactored as a strategy (rather than as private methods)
>> as a further step.
>
> Agreed.
>>
>> So if at least the Percentile class were to allow the quantile
>> position to be customized in subclasses, then
>> the end user could supply the formula of their choice.
>>
>> The most minimal change I am proposing here is to just make the
>> quantile position setting a protected method, and I have attached a
>> possible patch
>> in [MATH-1120] <https://issues.apache.org/jira/browse/MATH-1120>.
>>
>> I request everyone's opinion on this.
>
> I think that what would be best here would be to really dig into the
> different kinds of algorithms that see practical use and then
> encapsulate a strategy object of some kind that could be passed in
> as an optional constructor argument.  I would start with [1] as a
> reference.  We don't actually have to implement anything but what
> you have immediate need for; but we should design the
> QuantileStrategy (or better name) object so that it can carry the
> right configuration parameters for the different strategies likely
> to be needed.

Any objection to having a protected method, as the OP suggested?
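
To make the question concrete, here is a minimal sketch of the
protected-method variant. It is not the actual Percentile code or the
patch attached to MATH-1120: the class names SimplePercentile and
SpreadsheetStylePercentile and the method position(p, n) are invented
for illustration, and the linear interpolation between neighbouring
order statistics is an assumption. Only the two position formulas are
taken from the OP's message.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration; NOT the Commons Math Percentile class.
class SimplePercentile {

    /** 1-based fractional position of the p-th percentile in a sorted sample of size n. */
    protected double position(double p, int n) {
        // Formula the OP attributes to the existing class.
        return p * (n + 1) / 100.0;
    }

    /** Estimates the p-th percentile (validation of p is omitted in this sketch). */
    public double evaluate(double[] values, double p) {
        int n = values.length;
        if (n == 0) {
            return Double.NaN;
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double pos = position(p, n);
        if (pos < 1) {
            return sorted[0];          // below the first order statistic
        }
        if (pos >= n) {
            return sorted[n - 1];      // at or beyond the last order statistic
        }
        int lower = (int) Math.floor(pos);
        double frac = pos - lower;
        // Assumed: linear interpolation between the two neighbouring values.
        return sorted[lower - 1] + frac * (sorted[lower] - sorted[lower - 1]);
    }
}

// Subclass that only swaps the position formula, as the OP proposes.
class SpreadsheetStylePercentile extends SimplePercentile {
    @Override
    protected double position(double p, int n) {
        // Formula the OP reports as closely matching the spreadsheet result.
        return p * (n - 1) / 100.0 + 1;
    }
}
```

For the sample {1, 2, 3, 4, 5} and p = 25, the first formula gives
position 1.5 and the second gives position 2, so the two classes return
different values for the same data. That is the kind of discrepancy the
OP describes.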


Gilles

>
> Phil
>
> [1] Hyndman, R. J. and Fan, Y. (1996) Sample quantiles in
> statistical packages, /American Statistician/ *50*, 361–365.
>>
>> thanks
>> venkat
>>
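
For comparison, Phil's suggestion of a strategy object passed as an
optional constructor argument could look roughly like the sketch below.
Everything here is hypothetical: QuantileStrategy is the placeholder
name from his message, while BasicQuantileStrategy, its two constants
and StrategyPercentile are invented for illustration. A real design
would need to carry more configuration to cover the estimators surveyed
in [1].

```java
import java.util.Arrays;

/** Hypothetical strategy: maps (p, n) to a 1-based fractional position. */
interface QuantileStrategy {
    double position(double p, int n);
}

/** Two illustrative strategies; the names are invented for this sketch. */
enum BasicQuantileStrategy implements QuantileStrategy {
    N_PLUS_ONE {
        @Override
        public double position(double p, int n) {
            return p * (n + 1) / 100.0;
        }
    },
    N_MINUS_ONE {
        @Override
        public double position(double p, int n) {
            return p * (n - 1) / 100.0 + 1;
        }
    }
}

// Hypothetical percentile class that delegates the position to a strategy.
class StrategyPercentile {
    private final QuantileStrategy strategy;

    /** Default constructor keeps the p * (N+1)/100 behaviour. */
    StrategyPercentile() {
        this(BasicQuantileStrategy.N_PLUS_ONE);
    }

    /** Optional constructor argument selects another position formula. */
    StrategyPercentile(QuantileStrategy strategy) {
        this.strategy = strategy;
    }

    double evaluate(double[] values, double p) {
        int n = values.length;
        if (n == 0) {
            return Double.NaN;
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double pos = strategy.position(p, n);
        if (pos < 1) {
            return sorted[0];
        }
        if (pos >= n) {
            return sorted[n - 1];
        }
        int lower = (int) Math.floor(pos);
        // Assumed: linear interpolation between neighbouring order statistics.
        return sorted[lower - 1] + (pos - lower) * (sorted[lower] - sorted[lower - 1]);
    }
}
```

With this shape, existing callers that use the no-argument constructor
keep the current behaviour, and a caller who needs the other formula
writes new StrategyPercentile(BasicQuantileStrategy.N_MINUS_ONE) without
subclassing.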



