commons-user mailing list archives

From Phil Steitz <phil.ste...@gmail.com>
Subject Re: [math] Summary Stats Higher Moments?
Date Tue, 29 Dec 2015 16:19:30 GMT
On 12/28/15 3:55 PM, michael.brzustowicz@gmail.com wrote:
> Hi Phil,
> That would be great! I think adding the third and fourth moments, along
> with skewness and kurtosis, would be a very useful addition.
>
> Also, considering the formulas in Pebay, perhaps a method like
>
> void merge(SummaryStatistics ss) {
>     // use Pebay update formulas to merge un-normalized moments of ss with "this"
>     // if one is singleton use "update" method instead
> }
>
> could be added to SummaryStatistics in addition to "update". I realize
> AggregateSummaryStatistics takes care of merging 1st,2nd order stats, so
> this may be redundant.

I think AggregateSummaryStatistics takes care of the general case
for this; but IIRC we did at one point talk about adding a bivariate
method like you have above.  Seems a reasonable addition to
SummaryStatistics.
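
For reference, here is a minimal sketch of what such a pairwise merge could
look like, using the combination formulas for un-normalized central moments
as they are commonly stated from Pebay's report. MomentSummary and all of
its members are hypothetical names for illustration only; this is not the
Commons Math API, and it returns a new object rather than mutating "this"
as the proposed void merge(...) would.

```java
/** Hypothetical holder for a count, a mean and un-normalized central moments. */
final class MomentSummary {
    final long n;       // number of values summarized
    final double mean;  // arithmetic mean
    final double m2;    // sum of (x - mean)^2
    final double m3;    // sum of (x - mean)^3
    final double m4;    // sum of (x - mean)^4

    MomentSummary(long n, double mean, double m2, double m3, double m4) {
        this.n = n;
        this.mean = mean;
        this.m2 = m2;
        this.m3 = m3;
        this.m4 = m4;
    }

    /** Two-pass reference computation, used here only to build inputs. */
    static MomentSummary of(double... values) {
        long n = values.length;
        if (n == 0) {
            return new MomentSummary(0, 0, 0, 0, 0);
        }
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / n;
        double m2 = 0, m3 = 0, m4 = 0;
        for (double v : values) {
            double d = v - mean;
            double d2 = d * d;
            m2 += d2;
            m3 += d2 * d;
            m4 += d2 * d2;
        }
        return new MomentSummary(n, mean, m2, m3, m4);
    }

    /** Combines two summaries without revisiting the underlying data. */
    MomentSummary merge(MomentSummary other) {
        if (n == 0) {
            return other;
        }
        if (other.n == 0) {
            return this;
        }
        double nA = n;
        double nB = other.n;
        double nT = nA + nB;
        double delta = other.mean - mean;
        double delta2 = delta * delta;

        double newMean = mean + delta * nB / nT;
        double newM2 = m2 + other.m2 + delta2 * nA * nB / nT;
        double newM3 = m3 + other.m3
                + delta2 * delta * nA * nB * (nA - nB) / (nT * nT)
                + 3.0 * delta * (nA * other.m2 - nB * m2) / nT;
        double newM4 = m4 + other.m4
                + delta2 * delta2 * nA * nB * (nA * nA - nA * nB + nB * nB) / (nT * nT * nT)
                + 6.0 * delta2 * (nA * nA * other.m2 + nB * nB * m2) / (nT * nT)
                + 4.0 * delta * (nA * other.m3 - nB * m3) / nT;
        return new MomentSummary(n + other.n, newMean, newM2, newM3, newM4);
    }
}
```

Merging MomentSummary.of(2, 4, 4, 4) with MomentSummary.of(5, 5, 7, 9) should
give the same moments as summarizing all eight values at once, up to
floating-point rounding.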
>
> If there is something I can do to help, please let me know.
> -Mike Brzustowicz

What you are most welcome to do is to open a JIRA asking for the
features above and below and attach patches implementing the
features and adding test cases.  Ask here or offlist if you have any
questions about how to work with Git, JIRA, Maven, etc.

Phil
>
> On Wed, Dec 23, 2015 at 5:44 AM, Phil Steitz <phil.steitz@gmail.com> wrote:
>
>> On 12/22/15 9:58 AM, michael.brzustowicz@gmail.com wrote:
>>> Hi,
>>>
>>> I see that org.apache.commons.math3.stat.descriptive.DescriptiveStatistics
>>> uses the singleton update formulas (from Pebay) for calculating
>>> (un-normalized) moments up to the 4th moment. Is there some reason that
>>> org.apache.commons.math3.stat.descriptive.SummaryStatistics excludes both
>>> third and fourth central moments?
>>>
>>> Is it just a matter of computational efficiency, i.e., DescriptiveStatistics
>>> calculates moments only when the getter is invoked (and all orders need not
>>> be calculated at once) while the "storeless" SummaryStatistics would need
>>> to calculate all 4 orders at every call to update()?
>> Yes, that is the reason; but it is really more a matter of no one
>> having asked for this feature.  You are correct that the updating
>> formulas make this possible and the nested nature of the moments
>> means that there should not be much cost to adding the third and
>> fourth moments.  I would be happy to review and apply a patch (with
>> tests) adding these.
>>
>> Phil
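
To illustrate the "nested nature" point, here is a minimal sketch of a
storeless accumulator that updates all four un-normalized central moments on
each new value, using the one-value-at-a-time update formulas in their
commonly published form. StreamingMoments and its methods are hypothetical
names, not the Commons Math API. The skewness and kurtosis shown are the
plain population versions, which may differ from the bias-corrected values
reported by the Commons Math statistics.

```java
/** Hypothetical single-pass accumulator for moments up to order four. */
final class StreamingMoments {
    private long n;       // number of values seen so far
    private double mean;  // running mean
    private double m2;    // running sum of (x - mean)^2
    private double m3;    // running sum of (x - mean)^3
    private double m4;    // running sum of (x - mean)^4

    /** Adds one value; each higher moment reuses the lower ones. */
    void update(double x) {
        long n1 = n;              // count before this value
        n++;
        double delta = x - mean;
        double deltaN = delta / n;
        double deltaN2 = deltaN * deltaN;
        double term1 = delta * deltaN * n1;
        mean += deltaN;
        // Order matters: m4 needs the old m3 and m2, m3 needs the old m2.
        m4 += term1 * deltaN2 * (n * n - 3 * n + 3)
                + 6.0 * deltaN2 * m2 - 4.0 * deltaN * m3;
        m3 += term1 * deltaN * (n - 2) - 3.0 * deltaN * m2;
        m2 += term1;
    }

    long getN() { return n; }
    double getMean() { return mean; }
    double getM2() { return m2; }
    double getM3() { return m3; }
    double getM4() { return m4; }

    /** Population skewness: (m3 / n) / (m2 / n)^1.5. */
    double populationSkewness() {
        return Math.sqrt(n) * m3 / Math.pow(m2, 1.5);
    }

    /** Population excess kurtosis: (m4 / n) / (m2 / n)^2 - 3. */
    double populationExcessKurtosis() {
        return n * m4 / (m2 * m2) - 3.0;
    }
}
```

Each call to update() costs a handful of extra multiplications beyond what
the mean and second moment already need, which is the sense in which adding
the third and fourth moments should be cheap.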
>>
>>> Or is there some other blocker?
>>>
>>> Thanx,
>>> Mike Brzustowicz
>>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
>> For additional commands, e-mail: user-help@commons.apache.org
>>
>>




