commons-issues mailing list archives

From "Thomas Neidhart (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MATH-449) Storeless covariance
Date Thu, 16 Feb 2012 19:53:00 GMT

    [ https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209670#comment-13209670 ]

Thomas Neidhart commented on MATH-449:
--------------------------------------

I had missed some suggestions from Phil at first, and have committed them in r1245133.

The changes include:

* Drop setEntry and incrementCovariance. Rename incrementRow to increment and make it the only mutator. (/)
* Replace colDimension and rowDimension with a single dimension, forcing the matrix to be square. (/)
* Store only the upper triangular BivariateCovariances. (/)
** Add a transpose method to StorelessBivariateCovariance so getEntry returns something that can be further incremented properly. (x)
* Add symmetry tests. (/)
* Change getCovariance to return the actual covariance double value instead of the Storelessxxxx object. (/)
* Make StorelessBivariateCovariance package private, as it is not used outside StorelessCovariance. (/)

Returning the inner StorelessBivariateCovariance elements is dangerous: incrementing them
individually could break the symmetry, because they are now stored internally as an upper
triangular matrix. How adding a transpose method, as Phil described, would avoid this is
not obvious to me.

As there seems to be no actual use of the inner elements anyway, this part has been left out
for now.
Do you agree with the changes made so far?
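
For illustration, the upper-triangular storage with a single mutator could be sketched as below. This is a minimal sketch, not the code committed in r1245133: the class names BivariateAccumulatorSketch and UpperTriangularCovarianceSketch, the flat-array layout and the index mapping are assumptions made for the example. The per-pair update follows the one in the issue description.

```java
/** Hypothetical per-pair accumulator; the update mirrors the issue description. */
final class BivariateAccumulatorSketch {
    private double meanX;
    private double meanY;
    private long n;
    private double numerator;

    void increment(double x, double y) {
        n++;
        double deltaX = x - meanX; // deviation from the previous mean of x
        double deltaY = y - meanY; // deviation from the previous mean of y
        meanX += deltaX / n;
        meanY += deltaY / n;
        numerator += ((n - 1.0) / n) * deltaX * deltaY;
    }

    double getResult(boolean biasCorrected) {
        return biasCorrected ? numerator / (n - 1.0) : numerator / n;
    }
}

/** Hypothetical square covariance matrix that stores only cells with i <= j. */
final class UpperTriangularCovarianceSketch {
    private final int dimension;
    private final BivariateAccumulatorSketch[] cells;

    UpperTriangularCovarianceSketch(int dimension) {
        this.dimension = dimension;
        this.cells = new BivariateAccumulatorSketch[dimension * (dimension + 1) / 2];
        for (int k = 0; k < cells.length; k++) {
            cells[k] = new BivariateAccumulatorSketch();
        }
    }

    /** Maps (i, j) and (j, i) to the same cell, so symmetry holds by construction. */
    private int index(int i, int j) {
        int row = Math.min(i, j);
        int col = Math.max(i, j);
        // Row r of the upper triangle starts after the (dimension - k) cells of each row k < r.
        return row * dimension - row * (row - 1) / 2 + (col - row);
    }

    /** The only mutator: one observation updates every stored pair exactly once. */
    void increment(double[] observation) {
        if (observation.length != dimension) {
            throw new IllegalArgumentException("expected " + dimension + " values");
        }
        for (int i = 0; i < dimension; i++) {
            for (int j = i; j < dimension; j++) {
                cells[index(i, j)].increment(observation[i], observation[j]);
            }
        }
    }

    /** Returns a plain double, so callers cannot update a single cell on their own. */
    double getCovariance(int i, int j, boolean biasCorrected) {
        return cells[index(i, j)].getResult(biasCorrected);
    }
}
```

Because getCovariance returns a double and increment always takes a whole observation, no caller can advance one cell without the others, which is the symmetry concern raised above.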
                
> Storeless covariance
> --------------------
>
>                 Key: MATH-449
>                 URL: https://issues.apache.org/jira/browse/MATH-449
>             Project: Commons Math
>          Issue Type: Improvement
>            Reporter: Patrick Meyer
>            Assignee: Thomas Neidhart
>             Fix For: 3.0
>
>         Attachments: MATH-449.patch, MATH-449.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Currently there is no storeless version for computing the covariance. However, Pebay (2008) describes algorithms for on-line covariance computations, [http://infoserve.sandia.gov/sand_doc/2008/086212.pdf].
> I have provided a simple class for implementing this algorithm. It would be nice to have this integrated into org.apache.commons.math.stat.correlation.Covariance.
> {code}
> //This code is granted for inclusion in the Apache Commons under the terms of the ASL.
> public class StorelessCovariance{
>     private double deltaX = 0.0;
>     private double deltaY = 0.0;
>     private double meanX = 0.0;
>     private double meanY = 0.0;
>     private double N=0;
>     private Double covarianceNumerator=0.0;
>     private boolean unbiased=true;
>     public StorelessCovariance(boolean unbiased){
>         this.unbiased = unbiased;
>     }
>     public void increment(Double x, Double y){
>         if(x!=null && y!=null){
>             N++;
>             deltaX = x - meanX;
>             deltaY = y - meanY;
>             meanX += deltaX/N;
>             meanY += deltaY/N;
>             covarianceNumerator += ((N-1.0)/N)*deltaX*deltaY;
>         }
>         
>     }
>     public Double getResult(){
>         if(unbiased){
>             return covarianceNumerator/(N-1.0);
>         }else{
>             return covarianceNumerator/N;
>         }
>     }   
> }
> {code}
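
For reference, the update performed by increment in the quoted class can be written as the recurrence below. The symbols are introduced here for illustration only: n is the count after the new pair (x_n, y_n) arrives, \bar{x}_n and \bar{y}_n are the running means (the fields meanX and meanY), and C_n is the running numerator (covarianceNumerator).

```latex
\begin{aligned}
\delta_x &= x_n - \bar{x}_{n-1}, &\qquad \delta_y &= y_n - \bar{y}_{n-1},\\
\bar{x}_n &= \bar{x}_{n-1} + \frac{\delta_x}{n}, &\qquad \bar{y}_n &= \bar{y}_{n-1} + \frac{\delta_y}{n},\\
C_n &= C_{n-1} + \frac{n-1}{n}\,\delta_x\,\delta_y, &&\\
\operatorname{cov}_{\text{unbiased}} &= \frac{C_n}{n-1}, &\qquad \operatorname{cov}_{\text{biased}} &= \frac{C_n}{n}.
\end{aligned}
```

Both deltas are taken against the means from before the update, which is why the code computes deltaX and deltaY before touching meanX and meanY.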

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
