commons-dev mailing list archives

From Al Chou <hotfusion...@yahoo.com>
Subject Re: cvs commit: jakarta-commons-sandbox/math/src/java/org/apache/commons/math/stat StatUtils.java
Date Wed, 18 Jun 2003 05:49:32 GMT
--- mdiggory@apache.org wrote:
> mdiggory    2003/06/17 20:01:28
> 
>   Modified:    math/src/java/org/apache/commons/math/stat StatUtils.java
>   Log:
>   Adding corrected two-pass algorithm for variance calculation.
>   
>   Revision  Changes    Path
>   1.5       +11 -2     jakarta-commons-sandbox/math/src/java/org/apache/commons/math/stat/StatUtils.java
>   
>   Index: StatUtils.java
>   ===================================================================
>   RCS file: /home/cvs/jakarta-commons-sandbox/math/src/java/org/apache/commons/math/stat/StatUtils.java,v
>   retrieving revision 1.4
>   retrieving revision 1.5
>   diff -u -r1.4 -r1.5
>   --- StatUtils.java	18 Jun 2003 01:56:03 -0000	1.4
>   +++ StatUtils.java	18 Jun 2003 03:01:28 -0000	1.5
>   @@ -155,7 +155,14 @@
>        }
>        
>    	/**
>   -     * Returns the variance of the available values.
>   +     * Returns the variance of the available values. This uses a corrected
>   +     * two pass algorithm of the following 
>   +     * <a href="http://lib-www.lanl.gov/numerical/bookcpdf/c14-1.pdf">
>   +     * corrected two pass formula (14.1.8)</a>, and also referenced in:<p/>
>   +     * "Algorithms for Computing the Sample Variance: Analysis and
>   +     * Recommendations", Chan, T.F., Golub, G.H., and LeVeque, R.J. 
>   +     * 1983, American Statistician, vol. 37, pp. 242-247.
>   +     * 
>         * @param values Is a double[] containing the values
>         * @return the result, Double.NaN if no values for an empty array 
>         * or 0.0 for a single value set.  
>   @@ -168,10 +175,12 @@
>    		} else if (values.length > 1) {
>    			double mean = mean(values);
>    			double accum = 0.0;
>   +            double accum2 = 0.0;
>    			for (int i = 0; i < values.length; i++) {
>    				accum += Math.pow((values[i] - mean), 2.0);
>   +                accum2 += (values[i] - mean);
>    			}
>   -			variance = accum / (double)(values.length - 1);
>   +			variance = (accum - (Math.pow(accum2,2)/(double)values.length)) / (double)(values.length - 1);

Maybe I'm showing my old Fortran programmer's bias, or optimizing prematurely
without profiling first, but is there a good reason to call Math.pow for the
square on this line rather than just multiplying?  I can kind of see why you
wouldn't want to introduce a new variable in the "accum +=" line above this
one, but here accum2 is already a variable, so accum2 * accum2 would do the
job without a function call.
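
To make the suggestion concrete, here is a rough sketch of the same corrected
two-pass calculation with plain multiplications in place of Math.pow. The
class name VarianceSketch and the local variable dev are mine, not part of
StatUtils, and I compute the mean inline only so the example stands alone. I
haven't profiled either version, so take it as an illustration rather than a
proposed patch.

```java
public final class VarianceSketch {

    private VarianceSketch() {
    }

    /**
     * Corrected two-pass sample variance, squaring by multiplication.
     * Returns Double.NaN for an empty array and 0.0 for a single value,
     * matching the behaviour described in the StatUtils Javadoc.
     */
    public static double variance(double[] values) {
        if (values.length == 0) {
            return Double.NaN;
        }
        if (values.length == 1) {
            return 0.0;
        }

        // First pass: the mean (inlined here; StatUtils calls mean(values)).
        double sum = 0.0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
        }
        double mean = sum / (double) values.length;

        // Second pass: sum of squared deviations plus the correction term.
        double accum = 0.0;
        double accum2 = 0.0;
        for (int i = 0; i < values.length; i++) {
            double dev = values[i] - mean; // hypothetical extra local
            accum += dev * dev;            // instead of Math.pow(dev, 2.0)
            accum2 += dev;
        }

        // accum2 * accum2 instead of Math.pow(accum2, 2)
        return (accum - (accum2 * accum2) / (double) values.length)
                / (double) (values.length - 1);
    }
}
```

If the extra local in the loop is unwelcome, the last line alone can change,
since accum2 is already in a variable there.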

>    		}
>    		return variance;
>    	}


Al

=====
Albert Davidson Chou

    Get answers to Mac questions at http://www.Mac-Mgrs.org/ .
