commons-issues mailing list archives

From "Phil Steitz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MATH-449) Storeless covariance
Date Sun, 21 Aug 2011 05:07:27 GMT

    [ https://issues.apache.org/jira/browse/MATH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088317#comment-13088317 ]

Phil Steitz commented on MATH-449:
----------------------------------

Thanks for the patch!

Definitely a useful addition.  Looking carefully at the code, I think the following would
be good:

1. StorelessCovarianceMatrix really corresponds to Covariance.  The current implementations
of Covariance and PearsonsCorrelation are really matrix-valued.  So I think StorelessCovarianceMatrix
should be called StorelessCovariance and what is now StorelessCovariance should be BivariateStorelessCovariance.
 (Of course, one could argue that it is the current classes that are misnamed.  If people
feel strongly that this is the case, we can discuss changing those names and creating bivariate
versions.  In any case, we should be consistent.)

2. I think StorelessCovariance (the matrix version) should extend Covariance.  This should
work, just omitting array/matrix constructors and overriding getMatrix as it implements it
now.  The advantage of this is that it can then be used, for example, to create a correlation
matrix using the method exposed by PearsonsCorrelation.

3. We need to fill in the missing javadoc.

Thanks again for the patch.  I will take care of the items above if there are no objections
and no one beats me to it.
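
To make the naming in items 1 and 2 concrete, here is a minimal, self-contained sketch of
how the two classes could fit together.  It is only an illustration under stated assumptions:
the class StorelessCovarianceSketch and the method getData are hypothetical names, it returns
a plain double[][] rather than a RealMatrix, and it does not extend the real Covariance class
so that it compiles without Commons Math on the classpath.  The update rule is the one from
the class in the issue description.

```java
/** Hypothetical bivariate accumulator; the name follows the proposal in item 1. */
class BivariateStorelessCovariance {
    private double meanX;
    private double meanY;
    private long n;
    private double covarianceNumerator;
    private final boolean biasCorrected;

    BivariateStorelessCovariance(boolean biasCorrected) {
        this.biasCorrected = biasCorrected;
    }

    /** One-pass update, same rule as the class in the issue description. */
    void increment(double x, double y) {
        n++;
        double deltaX = x - meanX;
        double deltaY = y - meanY;
        meanX += deltaX / n;
        meanY += deltaY / n;
        covarianceNumerator += ((n - 1.0) / n) * deltaX * deltaY;
    }

    /** Returns NaN when there are too few observations for the chosen divisor. */
    double getResult() {
        long divisor = biasCorrected ? n - 1 : n;
        return divisor > 0 ? covarianceNumerator / divisor : Double.NaN;
    }
}

/** Hypothetical matrix-valued version: one bivariate accumulator per pair (i, j), i <= j. */
class StorelessCovarianceSketch {
    private final int dimension;
    private final BivariateStorelessCovariance[][] cells;

    StorelessCovarianceSketch(int dimension, boolean biasCorrected) {
        this.dimension = dimension;
        this.cells = new BivariateStorelessCovariance[dimension][dimension];
        for (int i = 0; i < dimension; i++) {
            for (int j = i; j < dimension; j++) {
                cells[i][j] = new BivariateStorelessCovariance(biasCorrected);
            }
        }
    }

    /** Adds one observation (one row of the data matrix) without storing it. */
    void increment(double[] row) {
        if (row.length != dimension) {
            throw new IllegalArgumentException("expected " + dimension + " values");
        }
        for (int i = 0; i < dimension; i++) {
            for (int j = i; j < dimension; j++) {
                cells[i][j].increment(row[i], row[j]);
            }
        }
    }

    /** Builds the symmetric covariance matrix from the upper-triangle accumulators. */
    double[][] getData() {
        double[][] out = new double[dimension][dimension];
        for (int i = 0; i < dimension; i++) {
            for (int j = i; j < dimension; j++) {
                out[i][j] = cells[i][j].getResult();
                out[j][i] = out[i][j];
            }
        }
        return out;
    }
}
```

In the real patch, the matrix version would extend Covariance and expose this result through
the overridden matrix accessor, which is what would let PearsonsCorrelation consume it.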


> Storeless covariance
> --------------------
>
>                 Key: MATH-449
>                 URL: https://issues.apache.org/jira/browse/MATH-449
>             Project: Commons Math
>          Issue Type: Improvement
>            Reporter: Patrick Meyer
>            Assignee: Phil Steitz
>             Fix For: 3.1
>
>         Attachments: MATH-449.patch
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Currently there is no storeless version for computing the covariance. However, Pebay (2008)
> describes algorithms for on-line covariance computations, [http://infoserve.sandia.gov/sand_doc/2008/086212.pdf].
> I have provided a simple class for implementing this algorithm. It would be nice to have this
> integrated into org.apache.commons.math.stat.correlation.Covariance.
> {code}
> //This code is granted for inclusion in the Apache Commons under the terms of the ASL.
> public class StorelessCovariance{
>     private double deltaX = 0.0;
>     private double deltaY = 0.0;
>     private double meanX = 0.0;
>     private double meanY = 0.0;
>     private double N=0;
>     private Double covarianceNumerator=0.0;
>     private boolean unbiased=true;
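>     // Constructor name must match the class name for the code to compile.
>     public StorelessCovariance(boolean unbiased){
>         this.unbiased = unbiased;
>     }
>     // Pairs with a null component are skipped.
>     public void increment(Double x, Double y){
>         if(x!=null && y!=null){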
>             N++;
>             deltaX = x - meanX;
>             deltaY = y - meanY;
>             meanX += deltaX/N;
>             meanY += deltaY/N;
>             covarianceNumerator += ((N-1.0)/N)*deltaX*deltaY;
>         }
>         
>     }
>     public Double getResult(){
>         if(unbiased){
>             return covarianceNumerator/(N-1.0);
>         }else{
>             return covarianceNumerator/N;
>         }
>     }   
> }
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
