> Copied from the mailing list:
> ----- Original Message -----
> From: "John Gant" <john.gant@gmail.com>
> >
> > Specifically testTwo() in
> > http://issues.apache.org/bugzilla/attachment.cgi?id=16172 takes care
> > of data with equal value (i.e. equal rank), is this the type of
> > situation to which you are referring? Yes I agree, we should implement
> > routines to sort in more diverse ways, but for right now I depend upon
> > Arrays.sort() to perform the sorting.
>
> Yes, and in this case the implementation is incorrectly computing the Spearman
> correlation as 0.1. But, according to R, the correlation is drastically
> different:
>
> > x <- c(2.0, 1.0, 3.0, 3.0, 5.0)
> > y <- c(4.0, 4.0, 1.0, 2.0, 3.0)
> > cor(x, y, method="spearman")
> [1] -0.631579
>
> Thus, I hold the implementation needs to change to correctly rank data with
> ties.
Please take a look at http://www.louisville.edu/~jdgant01/SRCC.xls ;
it should agree with the unit tests for
SpearmanRankCrossCorrelation.java (please tell me if you see a
discrepancy). The TiesRankEquivalent.java class is a very generic, simple
implementation and can be discarded if necessary. From what I have
read, tie ranking is the biggest and most complicated issue with
Spearman rank correlation. Alongside development on K-means, I will
try to implement each of the ranking algorithms that R uses in
its cor() function.
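
For reference, here is a minimal sketch of average ("mid-rank") tie
handling, which is what R's rank() does by default and what
cor(..., method="spearman") relies on. The class and method names
(TieRankSketch, averageRanks, pearson, spearman) are hypothetical and
are not taken from the attached patch; the sketch assumes equal-length
inputs with no NaN values and at least two distinct values per array.

```java
import java.util.Arrays;

// Hypothetical sketch, not the attached TiesRankEquivalent.java.
final class TieRankSketch {

    /** Returns 1-based ranks; tied values share the average of their positions. */
    static double[] averageRanks(double[] values) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        // Sort the indices by the values they point at.
        Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));

        double[] ranks = new double[n];
        int start = 0;
        while (start < n) {
            // Extend the run [start, end] over all values equal to values[order[start]].
            int end = start;
            while (end + 1 < n && values[order[end + 1]] == values[order[start]]) {
                end++;
            }
            // Average of the 1-based positions start+1 .. end+1.
            double shared = (start + end) / 2.0 + 1.0;
            for (int k = start; k <= end; k++) {
                ranks[order[k]] = shared;
            }
            start = end + 1;
        }
        return ranks;
    }

    /** Plain Pearson correlation of two equal-length arrays. */
    static double pearson(double[] a, double[] b) {
        int n = a.length;
        double meanA = 0.0;
        double meanB = 0.0;
        for (int i = 0; i < n; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= n;
        meanB /= n;

        double sab = 0.0;
        double saa = 0.0;
        double sbb = 0.0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            sab += da * db;
            saa += da * da;
            sbb += db * db;
        }
        return sab / Math.sqrt(saa * sbb);
    }

    /** Spearman correlation computed as Pearson correlation of the average ranks. */
    static double spearman(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("x and y must have the same length");
        }
        return pearson(averageRanks(x), averageRanks(y));
    }
}
```

With the x and y vectors quoted above, averageRanks gives
(2, 1, 3.5, 3.5, 5) and (4.5, 4.5, 1, 2, 3), and spearman evaluates to
-6/9.5, about -0.6316. Computing Pearson on the ranks avoids the
shortcut formula 1 - 6*sum(d^2)/(n(n^2-1)), which is only exact when
there are no ties.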
Thanks,
John

