commons-dev mailing list archives

From Phil Steitz
Subject Re: [math]: spearman rank cross correlation
Date Wed, 31 Aug 2005 03:03:06 GMT
On 8/29/05, John Gant wrote:
> Wondering if my next step is to implement the ranking algorithms in R?

I don't know exactly what you mean here. If you mean implementing R's ranking
algorithms in Java, that would be nice.

> Please review the results of the unit tests when compared to the excel
> workbook. The spearman cross correlation can be checked against R once
> ranking algorithms are finished, otherwise it must be checked by hand.

Where is the Excel workbook?

> Sorry about attaching so many files with no obvious explanation. My
> thought process for ranking algorithms is as follows:
> Interface -> RankingAlgorithm, includes the rank(double [] data) method.

Yes, sounds good. 

> Concrete Class -> This class implements the
> rank() method by assigning equivalent raw values the same rank without
> incrementing the rank per duplicate raw value.

Also good.
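
To make sure we mean the same thing, here is a minimal sketch of the interface
plus the tie-handling class as I read your description. Only RankingAlgorithm
and rank(double[] data) come from your note; the class name DenseRanking and
the 1-based ranks are my assumptions, not anything taken from your attachment.

```java
import java.util.Arrays;

/** Strategy interface from the proposal: maps raw values to ranks. */
interface RankingAlgorithm {
    double[] rank(double[] data);
}

/**
 * Hypothetical name. Equal raw values share one rank, and the next distinct
 * value gets the next integer rank (no gaps after duplicates).
 * Values are compared with Double.compare, so -0.0 and 0.0 rank separately.
 */
class DenseRanking implements RankingAlgorithm {
    public double[] rank(double[] data) {
        double[] sorted = data.clone();
        Arrays.sort(sorted);

        // Compact the sorted copy so its first 'distinct' entries are unique.
        int distinct = 0;
        for (int i = 0; i < sorted.length; i++) {
            if (i == 0 || Double.compare(sorted[i], sorted[i - 1]) != 0) {
                sorted[distinct++] = sorted[i];
            }
        }

        // The rank of a value is its 1-based position among the distinct values.
        double[] ranks = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            ranks[i] = Arrays.binarySearch(sorted, 0, distinct, data[i]) + 1;
        }
        return ranks;
    }
}
```

With this reading, the input {10, 20, 10, 30} ranks as {1, 2, 1, 3}.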

> Future Classes: all of R's ranking algorithms.

Also good, that is essentially what Brent was suggesting and I agree these 
will be useful. The only question is package placement (see below).

> The SpearmanRankCrossCorrelation takes an instance of RankingAlgorithm
> as a constructor parameter. This allows for a more "pluggable"
> algorithm. If this seems incorrect please reply so that I do not
> implement K-means, or other clustering algorithms who use a distance
> measurement, in the same manner.

That also sounds fine. I have a couple of suggestions, however, about 
package placement and organization of the code. Just as Pearson's R is 
essentially an operation on vectors, so is Spearman's, so it would seem more 
natural to me to define a class that just provides a pairwise correlation of 
two vectors, as SimpleRegression does. Then the matrix-filling version can 
be provided similarly to what you did for Pearson's. The ranking algorithm 
implementations should probably go into something like o.a.c.m.stat.ranks, and
I would put the Spearman's implementation, along with a wrapper for the
Pearson's impl inside SimpleRegression, into o.a.c.m.stat.correlation. I am
interested in others' views on this.
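
Roughly what I have in mind for the pairwise version is sketched here: rank
each vector with the plugged-in algorithm, then take Pearson's r of the ranks.
This is only an illustration under my own assumptions. The class names
SpearmansCorrelation and AverageRanking are hypothetical, the Pearson
computation is written out inline rather than delegated to SimpleRegression,
and the interface is repeated so the snippet compiles on its own.

```java
/** Strategy interface from the proposal, repeated for self-containment. */
interface RankingAlgorithm {
    double[] rank(double[] data);
}

/** Hypothetical helper: tied values get the average of the ranks they span. */
class AverageRanking implements RankingAlgorithm {
    public double[] rank(double[] data) {
        double[] ranks = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            int less = 0;
            int equal = 0;
            for (double v : data) {
                if (v < data[i]) {
                    less++;
                } else if (v == data[i]) {
                    equal++;
                }
            }
            // Ties occupy ranks less+1 .. less+equal; use their mean.
            ranks[i] = less + (equal + 1) / 2.0;
        }
        return ranks;
    }
}

/** Hypothetical name: pairwise Spearman correlation of two vectors. */
class SpearmansCorrelation {
    private final RankingAlgorithm ranking;

    SpearmansCorrelation(RankingAlgorithm ranking) {
        this.ranking = ranking;
    }

    double correlation(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two vectors of equal length >= 2");
        }
        return pearson(ranking.rank(x), ranking.rank(y));
    }

    /** Pearson's r; returns NaN if either input is constant. */
    static double pearson(double[] a, double[] b) {
        int n = a.length;
        double meanA = 0.0;
        double meanB = 0.0;
        for (int i = 0; i < n; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= n;
        meanB /= n;
        double sab = 0.0;
        double saa = 0.0;
        double sbb = 0.0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            sab += da * db;
            saa += da * da;
            sbb += db * db;
        }
        return sab / Math.sqrt(saa * sbb);
    }
}
```

The matrix-filling version would then just call correlation() on each pair of
columns, and swapping in a different tie-handling rule only means passing a
different RankingAlgorithm to the constructor.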

Thanks and sorry for the response latency. I will get the Pearson's stuff 
committed shortly.


> John
