commons-dev mailing list archives

From "Phil Steitz (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MATH-163) The evaluate method and the getResult method of class Variance give different results
Date Sun, 01 Apr 2007 20:21:32 GMT

    [ https://issues.apache.org/jira/browse/MATH-163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12485920 ]

Phil Steitz commented on MATH-163:
----------------------------------

Thanks for reporting this.  I agree with Rory that the spirit of IEEE 754 (which says to examine
the limit as x -> INF when evaluating expressions involving INF) implies that the result of this
computation should be positive infinity in this particular case, as the getResult() method
gives.  For the reasons described below, however, it may be difficult to handle all INF cases
correctly without impacting performance, so I am leaning toward WONTFIX at this point, though
I am open to suggestions / patches.
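
To illustrate the limit argument, here is a minimal sketch in plain Java double arithmetic.
The class and method names are hypothetical and are not part of Commons Math.  It shows that
the squared deviation of INF from a finite mean is INF, while the deviation of INF from an
infinite mean has no defined limit and comes out as NaN.

```java
public class InfinityDeviation {

    /** Squared deviation of x from a given mean, using plain double arithmetic. */
    static double squaredDeviation(double x, double mean) {
        double dev = x - mean;
        return dev * dev;
    }

    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY;
        // Finite mean: the deviation is infinite, matching the limit as x -> INF.
        System.out.println(squaredDeviation(inf, 1.5));   // prints "Infinity"
        // Infinite mean: INF - INF has no defined limit, so the result is NaN.
        System.out.println(squaredDeviation(inf, inf));   // prints "NaN"
    }
}
```

Any formula that first computes the mean of data containing INF and then subtracts it from
each value will run into the second case.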

The results of the two methods differ because they use different computing formulas.  The
getResult method is meant to be used when the data is not persisted, i.e., after repeatedly
calling increment, supplying values in a stream (and updating sums) but not storing the whole
set of values.  It therefore uses a "one pass" algorithm ("West's algorithm", referenced in
the javadoc) to compute the variance.  The evaluate method exploits the fact that it has the
full array of values supplied and uses a two-pass method (the "corrected two-pass algorithm"
from Chan, Golub, LeVeque, "Algorithms for Computing the Sample Variance", American Statistician,
August 1983).  These methods may give different results in some examples, with the second
being more accurate.  The javadoc should be improved to make this clearer and to recommend
that evaluate be preferred over incrementAll followed by getResult when the full array of
values is available.  I will make that change.
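
To make the difference concrete, here is a minimal, self-contained sketch of the two
approaches.  It is not the Commons Math source: the class and method names are hypothetical,
and the exact form of the one-pass update (adding (n - 1) * dev * dev / n to the running
sum of squares) is my assumption about an updating formula in the style of West's algorithm.
Under those assumptions it produces the same pair of results as the report for the input
{1.0, 2.0, INF}.

```java
public class VarianceSketch {

    /** One-pass (streaming) sample variance using an updating formula. */
    static double onePass(double[] values) {
        long n = 0;
        double mean = 0.0;
        double m2 = 0.0;                      // running sum of squared deviations
        for (double x : values) {
            n++;
            double dev = x - mean;            // deviation from the previous mean
            double nDev = dev / n;
            mean += nDev;
            m2 += (n - 1) * dev * nDev;       // assumed update form, see lead-in
        }
        return n > 1 ? m2 / (n - 1) : Double.NaN;
    }

    /** Corrected two-pass sample variance: compute the mean, then sum deviations. */
    static double twoPass(double[] values) {
        int n = values.length;
        if (n < 2) {
            return Double.NaN;
        }
        double sum = 0.0;
        for (double x : values) {
            sum += x;
        }
        double mean = sum / n;                // INF if any value is +INF
        double sumSq = 0.0;
        double sumDev = 0.0;
        for (double x : values) {
            double dev = x - mean;            // INF - INF = NaN for the infinite value
            sumSq += dev * dev;
            sumDev += dev;
        }
        // The sumDev term corrects for rounding error in the computed mean.
        return (sumSq - sumDev * sumDev / n) / (n - 1);
    }

    public static void main(String[] args) {
        double[] values = {1.0, 2.0, Double.POSITIVE_INFINITY};
        System.out.println(twoPass(values));  // prints "NaN"
        System.out.println(onePass(values));  // prints "Infinity"
    }
}
```

In this sketch the two-pass version computes an infinite mean first, so the deviation of the
infinite value is INF - INF = NaN.  The one-pass version only ever subtracts the previous
(finite) mean from INF, so it stays at INF for this input order.  Note that the sketch's
one-pass result depends on the order of the values: if INF is supplied first, the update
multiplies 0 by INF and also yields NaN.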

> The evaluate method and the getResult method of class Variance give different results
> -------------------------------------------------------------------------------------
>
>                 Key: MATH-163
>                 URL: https://issues.apache.org/jira/browse/MATH-163
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 1.1
>            Reporter: Nele Smeets
>
> Consider the following test code:
>   // construct an array of input values, containing infinity  
>   double[] values = new double[] {1.0, 2.0, Double.POSITIVE_INFINITY};
>   // find the variance using Variance.evaluate(double[])
>   Variance var1 = new Variance();
>   double value1 = var1.evaluate(values);
>   // find the variance using Variance.getResult()
>   Variance var2 = new Variance();
>   var2.incrementAll(values);
>   double value2 = var2.getResult();
>   // print out the results
>   System.out.println(value1);
>   System.out.println(value2);
> This code prints out:
> NaN
> Infinity
> So, we get two different variances, depending on the method we use. 
> (The same is true when we use Double.NEGATIVE_INFINITY as input value instead of Double.POSITIVE_INFINITY.)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

