commons-dev mailing list archives

From venkatesha murthy <venkateshamurth...@gmail.com>
Subject Re: [MATH-1120] Needed opinion about support on variations in percentile calculation
Date Sun, 01 Jun 2014 20:31:05 GMT
I have gone through Wikipedia and R functions to get an understanding.

My idea is to model the different estimation techniques as strategies
(enums) and inject one through the constructor when the Percentile object
is created. The evaluate method could then use this estimation technique to
complete the computation. The kth selection and pivoting can be further
encapsulated as nested classes and used within the EstimationTechnique enum.
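
A minimal sketch of the idea, assuming only the two position formulas
discussed in this thread. The enum name PositionStrategy, its constant
names, and the full sort (used here in place of a kth selection) are
illustrative assumptions and are not taken from the patch on MATH-1120.

```java
import java.util.Arrays;

/** Hypothetical strategy enum: each constant supplies one quantile position formula. */
enum PositionStrategy {
    /** p * (N + 1) / 100, the position the thread attributes to the existing Percentile class. */
    N_PLUS_ONE {
        @Override
        double position(double p, int n) {
            return p * (n + 1) / 100.0;
        }
    },
    /** p * (N - 1) / 100 + 1, the position the thread says closely matches MS Excel. */
    N_MINUS_ONE {
        @Override
        double position(double p, int n) {
            return p * (n - 1) / 100.0 + 1.0;
        }
    };

    /** Returns the 1-based, possibly fractional, position of the pth percentile among n sorted values. */
    abstract double position(double p, int n);

    /** Estimates the pth percentile (0 < p <= 100) by linear interpolation around the position. */
    double evaluate(double[] values, double p) {
        if (values.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("need non-empty data and 0 < p <= 100");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted); // full sort for brevity; a kth selection would avoid this
        int n = sorted.length;
        double pos = position(p, n);
        if (pos < 1) {
            return sorted[0];      // position falls before the first value
        }
        if (pos >= n) {
            return sorted[n - 1];  // position falls at or past the last value
        }
        int lowerIndex = (int) Math.floor(pos) - 1;  // convert 1-based position to 0-based index
        double fraction = pos - Math.floor(pos);
        return sorted[lowerIndex] + fraction * (sorted[lowerIndex + 1] - sorted[lowerIndex]);
    }
}
```

For the values {1, 2, 3, 4} and p = 25, N_PLUS_ONE gives position 1.25 and
a result of 1.25, while N_MINUS_ONE gives position 1.75 and a result of
1.75. A Percentile-like class could accept one of these constants as an
optional constructor argument and delegate to it from evaluate.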

I have updated MATH-1120 with a patch that has more details. Please let me
know your opinions.

Thanks
Venkat.


On Thu, May 22, 2014 at 7:53 AM, venkatesha murthy <
venkateshamurthyts@gmail.com> wrote:

> All,
>
> Agreed, and thanks for the opinions.
> I will work through this to come up with a draft design and
> propose it for review in some time.
>
> Thanks
> Venkat.
>
> On Thu, May 22, 2014 at 2:27 AM, Phil Steitz <phil.steitz@gmail.com>
> wrote:
>
>>  On 5/21/14, 1:43 PM, Gilles wrote:
>> > On Wed, 21 May 2014 13:16:26 -0700, Phil Steitz wrote:
>> >> On 5/21/14, 12:18 PM, venkatesha murthy wrote:
>> >>> Hi All,
>> >>>
>> >>> The existing Percentile class calculates the pth percentile of an
>> >>> array of size N using a quantile position fixed at p * (N+1)/100.
>> >>> However, if we enter the same numbers in MS Excel, the percentile
>> >>> it reports is different and closely resembles the formula
>> >>> [p*(N-1)/100]+1.
>> >>>
>> >>> At times it is imperative to match the computations of a standard
>> >>> spreadsheet or a standard tool;
>> >>
>> >> What is "imperative" is that the implementation matches what the
>> >> documentation says.  We do like to compare our results to other
>> >> packages, though, and to explain differences where they exist.  You
>> >> have basically done that above.
>> >>> which is why I request that the quantile position be
>> >>> customizable.
>> >>
>> >> That is a reasonable request, as there are lots of different ways to
>> >> compute quantiles.
>> >>> In fact, even the kth selection used could be refactored as a
>> >>> strategy (rather than private methods) as a further step.
>> >>
>> >> Agreed.
>> >>>
>> >>> So if the Percentile class at least allowed the quantile position
>> >>> to be customized in subclasses, end users could supply the formula
>> >>> of their choice.
>> >>>
>> >>> The most minimal change I am proposing here is to make the
>> >>> quantile position computation a protected method, and I have
>> >>> attached a possible patch in [MATH-1120]
>> >>> <https://issues.apache.org/jira/browse/MATH-1120>
>> >>>
>> >>> I request everyone's opinion on this.
>> >>
>> >> I think that what would be best here would be to really dig into the
>> >> different kinds of algorithms that see practical use and then
>> >> encapsulate a strategy object of some kind that could be passed in
>> >> as an optional constructor argument.  I would start with [1] as a
>> >> reference.  We don't actually have to implement anything but what
>> >> you have immediate need for; but we should design the
>> >> QuantileStrategy (or better name) object so that it can carry the
>> >> right configuration parameters for the different strategies likely
>> >> to be needed.
>> >
>> > Any objection to having a protected method, as the OP suggested?
>>
>> The problem there is that it forces the user to actually subclass
>> and once that is done the behavior is essentially undefined (i.e.,
>> the end user of whatever is created doesn't really have a clearly
>> defined contract unless they rewrite it).   Much better to actually
>> implement - and document - alternatives.
>>
>> That approach also only covers one aspect of the variability in
>> algorithms.
>>
>> Phil
>> >
>> >
>> > Gilles
>> >
>> >>
>> >> Phil
>> >>
>> >> [1] Hyndman, R. J. and Fan, Y. (1996) Sample quantiles in
>> >> statistical packages, /American Statistician/ *50*, 361–365.
>> >>>
>> >>> thanks
>> >>> venkat
>> >>>
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> > For additional commands, e-mail: dev-help@commons.apache.org
>> >
>> >
>
