Phil Steitz wrote:
> Since xbar = sum/n, the change has no impact on which sums are
> computed or squared. Instead of (sum/n)*(sum/n)*n your change just
> computes sum**2/n. The difference is that you are a) eliminating one
> division by n and one multiplication by n (no doubt a good thing) and b)
> replacing direct multiplication with pow(,2). The second of these used
> to be discouraged, but I doubt it makes any difference with modern
> compilers. I would suggest collapsing the denominators and doing just
> one cast, i.e., use
>
> (1) variance = sumsq - sum * (sum/(double) (n * (n - 1)))
>
> If
>
> (2) variance = sumsq - (sum * sum)/(double) (n * (n - 1)) or
>
> (3) variance = sumsq - Math.pow(sum,2)/(double) (n * (n - 1)) give
>
> better accuracy, use one of them; but I would favor (1) since it will be
> able to handle larger positive sums.
>
> I would also recommend forcing getVariance() to return 0 if the result
> is negative (which can happen in the right circumstances for any of
> these formulas).
>
> Phil
Collapsing is definitely good, but I'm not sure about these equations. In my
experience, an approach like (2) would look more like
variance = (((double) n) * sumsq - (sum * sum)) / (double) (n * (n - 1));
see (5) in http://mathworld.wolfram.com/kStatistic.html
As you've stated, this approach seems to have more than one benefit. I'll
also add a test for negative values and return 0.0 if they are present.
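
For what it's worth, a minimal sketch of that formula together with the
negative-value guard might look like the following. The class and method
names are made up for illustration and this is not the actual code in the
repository. It also assumes that fewer than two values should simply yield
0.0, and it does the denominator arithmetic in double so that n * (n - 1)
cannot overflow an int.

```java
public class VarianceSketch {

    // Hypothetical helper: sample variance from running totals.
    // sum   = sum of the values
    // sumsq = sum of the squared values
    // n     = number of values
    static double variance(double sum, double sumsq, long n) {
        if (n < 2) {
            // Assumption: variance is reported as 0.0 when it is undefined.
            return 0.0;
        }
        double dn = (double) n; // single conversion, reused below
        double v = (dn * sumsq - sum * sum) / (dn * (dn - 1.0));
        // Round-off can push the result slightly below zero; clamp it.
        return v < 0.0 ? 0.0 : v;
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0, 3.0, 4.0};
        double sum = 0.0;
        double sumsq = 0.0;
        for (double x : values) {
            sum += x;
            sumsq += x * x;
        }
        System.out.println(variance(sum, sumsq, values.length));
    }
}
```

Note that n * sumsq - sum * sum still subtracts two large, nearly equal
numbers when the values are big relative to their spread, so the clamp only
hides the sign of the error, not the loss of precision.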
Mark

