commons-user mailing list archives

From luc.maison...@free.fr
Subject Re: Commons Math vs. Excel stats?
Date Thu, 09 Nov 2006 20:45:55 GMT
Quoting Jeff Drew <jeffrdrew@gmail.com>:

> I'm having a weird problem when using the commons math package.  When I run
> statistics using the Commons math, then compare the results to Excel, I get
> different standard deviation and median, but min, max, and count are the
> same.  I'd appreciate any ideas on how Commons Math and Excel differ in
> these calculations.
>
> MEDIAN:  Excel:  468,231   CommonsMath:  485,711
> STD:        Excel:    11,861   CommonsMath:    10,678
>
> The data set is 18,000 integers so I won't include those.  They are mostly 6
> digit numbers.  Here's the code:

This is weird ...

For the median, one way to check what happens is to sort your data set in
ascending order and look at the data at the middle index. If you have an even
number of samples 2k, and 468231 is at index k (counting from 1) and 485711 is
at index k+1, then it is a matter of interpretation. If you have an odd number
of samples 2k+1, then the result MUST BE the value at index k+1 (counting from
1). Could you check this in both Excel and CommonsMath?
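
The check described above can be sketched in plain Java as follows. This is
only an illustration: the class and method names (MedianCheck, middleValues,
median) are made up for this example and are not part of Commons Math, and
averaging the two middle values for an even count is the usual convention, not
necessarily what either tool does.

```java
import java.util.Arrays;

public class MedianCheck {
    /** Returns the one (odd count) or two (even count) middle values of the sorted data. */
    public static double[] middleValues(double[] data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("empty data set");
        }
        double[] sorted = data.clone();   // do not disturb the caller's array
        Arrays.sort(sorted);              // ascending order
        int n = sorted.length;
        if (n % 2 == 1) {
            // n = 2k+1: the median is at index k+1 counting from 1, i.e. n/2 counting from 0
            return new double[] { sorted[n / 2] };
        }
        // n = 2k: the candidates are at indices k and k+1 counting from 1
        return new double[] { sorted[n / 2 - 1], sorted[n / 2] };
    }

    /** Conventional median: the middle value, or the mean of the two middle values. */
    public static double median(double[] data) {
        double[] mid = middleValues(data);
        return mid.length == 1 ? mid[0] : (mid[0] + mid[1]) / 2.0;
    }

    public static void main(String[] args) {
        double[] sample = { 5, 1, 4, 2 };   // made-up values, not the real data set
        System.out.println(Arrays.toString(middleValues(sample)));
        System.out.println(median(sample));
    }
}
```

If the two reported medians turn out to be the two values returned by
middleValues, or values close to them, the difference is one of convention
rather than a computation error.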

For the standard deviation, one way to check is to split your data set into two
parts, compute the moments of each part separately, and combine them afterwards
to compare against the result computed on the whole set.
Unfortunately, I am currently replying to you from a public area and cannot
provide you with the equations for the combination. These equations are based
on the linearity of the expectancy (is this the right English term?) and the
definition of the variance from the expectancy. If you prefer to wait until
Monday, I can provide these equations for you.
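
In the meantime, here is a minimal sketch of such a split-and-combine check,
using the textbook identities rather than any particular set of equations: the
mean E[X] and the mean of squares E[X^2] of the whole set are the size-weighted
averages of the per-part values, and the variance is E[X^2] - E[X]^2. The class
and method names are hypothetical, and this form loses some precision on large
values, so treat it as a sanity check only.

```java
public class CombineMoments {
    /** Returns { count, E[X], E[X^2] } for x[from] .. x[to - 1]. */
    public static double[] moments(double[] x, int from, int to) {
        double sum = 0.0;
        double sumSq = 0.0;
        for (int i = from; i < to; i++) {
            sum += x[i];
            sumSq += x[i] * x[i];
        }
        int n = to - from;
        return new double[] { n, sum / n, sumSq / n };
    }

    /** Combines two parts: each expectation is weighted by the size of its part. */
    public static double[] combine(double[] a, double[] b) {
        double n = a[0] + b[0];
        double mean = (a[0] * a[1] + b[0] * b[1]) / n;
        double meanSq = (a[0] * a[2] + b[0] * b[2]) / n;
        return new double[] { n, mean, meanSq };
    }

    /** Standard deviation from the moments; biasCorrected divides by n - 1 instead of n. */
    public static double stdDev(double[] m, boolean biasCorrected) {
        double variance = m[2] - m[1] * m[1];        // Var = E[X^2] - E[X]^2
        if (biasCorrected) {
            variance *= m[0] / (m[0] - 1.0);
        }
        return Math.sqrt(Math.max(variance, 0.0));   // guard against tiny negative rounding
    }

    public static void main(String[] args) {
        double[] data = { 2, 4, 4, 4, 5, 5, 7, 9 };  // made-up values, not the real data set
        double[] all = combine(moments(data, 0, 3), moments(data, 3, data.length));
        System.out.println(stdDev(all, false));      // divides by n
        System.out.println(stdDev(all, true));       // divides by n - 1
    }
}
```

When comparing with Excel, keep in mind that STDEV divides by n - 1 while
STDEVP divides by n, so the same convention has to be used on both sides.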

Luc


