commons-user mailing list archives

From: Thomas Neidhart
Subject: Re: [math] correlation analysis with NaNs
Date: Wed, 07 Nov 2012 13:09:06 GMT

On 11/07/2012 01:38 PM, Patrick Meyer wrote:
> You are getting values like 2.5 because of the default ties strategy. If you
> do not want to use that method, create an instance of RankingAlgorithm with
> a different ties strategy and pass it to the constructor for the
> SpearmansCorrelation. This approach also gives you control over the method
> for dealing with NaNs. Something like,
> // create data matrix (one row per observation, one column per variable)
> double[] column1 = new double[]{Double.NaN, 1, 2};
> double[] column2 = new double[]{10, 2, 10};
> Array2DRowRealMatrix mydata = new Array2DRowRealMatrix(column1.length, 2);
> for (int i = 0; i < column1.length; i++) {
> 	mydata.setEntry(i, 0, column1[i]);
> 	mydata.setEntry(i, 1, column2[i]);
> }
> // compute correlation with an explicit NaN and ties strategy
> NaturalRanking ranking = new NaturalRanking(NaNStrategy.FIXED,
> TiesStrategy.RANDOM);
> SpearmansCorrelation spearman = new SpearmansCorrelation(mydata, ranking);
> Try that.


This will not really help, IMHO.

As far as I can see, there are at least two problems with the current
use of the RankingAlgorithm in the SpearmansCorrelation class:

 * there is no way to select the ranking algorithm in the constructor
   without passing the values at the same time
 * NaNStrategy.REMOVED does not work symmetrically, i.e. it removes
   the NaN only from the input array where it occurs but not the paired
   entry in the corresponding array. That makes it useless here, because
   the two arrays end up with different lengths and an exception is
   thrown.
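
To illustrate what symmetric removal would look like, here is a minimal
sketch in plain Java. PairwiseNaNFilter and removeNaNPairs are hypothetical
names used only for this example; they are not part of Commons Math. The
helper drops a position from both arrays whenever either value is NaN, so
the results always have the same length.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of Commons Math. */
class PairwiseNaNFilter {

    /**
     * Returns copies of x and y that keep only the positions where
     * neither value is NaN, so both results have the same length.
     */
    static double[][] removeNaNPairs(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("array lengths differ");
        }
        double[] keptX = new double[x.length];
        double[] keptY = new double[y.length];
        int n = 0;
        for (int i = 0; i < x.length; i++) {
            // Keep the pair only if both entries are real numbers.
            if (!Double.isNaN(x[i]) && !Double.isNaN(y[i])) {
                keptX[n] = x[i];
                keptY[n] = y[i];
                n++;
            }
        }
        // Trim the buffers to the number of pairs actually kept.
        return new double[][] {Arrays.copyOf(keptX, n), Arrays.copyOf(keptY, n)};
    }

    public static void main(String[] args) {
        double[] column1 = {Double.NaN, 1, 2};
        double[] column2 = {10, 2, 10};
        double[][] cleaned = removeNaNPairs(column1, column2);
        System.out.println(Arrays.toString(cleaned[0])); // [1.0, 2.0]
        System.out.println(Arrays.toString(cleaned[1])); // [2.0, 10.0]
    }
}
```

Until this is fixed in the library, arrays cleaned this way could be passed
to the two-array correlation(double[], double[]) method as a workaround,
assuming dropping incomplete pairs is the behaviour you want.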

Would you be able to create an issue for this on the issue tracker and
provide the test case?
