commons-issues mailing list archives

From "Matt Adereth (JIRA)" <>
Subject [jira] [Commented] (MATH-814) Kendalls Tau Implementation
Date Tue, 29 Oct 2013 23:52:26 GMT


Matt Adereth commented on MATH-814:

Apologies in advance for not being consistent with SpearmansCorrelation. I couldn't bring
myself to make these methods non-static or to have the constructor take the data. If this
is a problem, I have no issue making the requisite change.

Another option that I didn't pursue, but would be willing to try, is to add a new
Correlation interface that exposes the methods that should be common to Spearman's, Kendall's,
and Pearson's correlations. It would probably then make sense to have an AbstractCorrelation
base class that handles the shepherding of data between Matrix, double[][], and double[], double[].
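
To make the idea concrete, here is a minimal sketch of what such a pair of types could look
like. Both Correlation and AbstractCorrelation are hypothetical names from this proposal, not
existing Commons Math types, and the method names are assumptions. The Matrix overload is left
out so the sketch has no dependencies, and setting the diagonal to 1 is also an assumption.

```java
/** Hypothetical interface: the operations every correlation measure would share. */
interface Correlation {
    /** Correlation between two equal-length samples. */
    double correlation(double[] x, double[] y);

    /** Pairwise correlations between the columns of a rows-by-columns array. */
    double[][] correlationMatrix(double[][] data);
}

/** Hypothetical base class: handles the data shuffling once for all measures. */
abstract class AbstractCorrelation implements Correlation {
    @Override
    public double[][] correlationMatrix(double[][] data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("data must have at least one row");
        }
        int rows = data.length;
        int cols = data[0].length;

        // Copy each column into its own array (assumes rectangular input).
        double[][] columns = new double[cols][rows];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                columns[c][r] = data[r][c];
            }
        }

        // Fill the symmetric result by delegating to the two-array method.
        double[][] out = new double[cols][cols];
        for (int i = 0; i < cols; i++) {
            out[i][i] = 1.0; // assumption: a column correlates perfectly with itself
            for (int j = 0; j < i; j++) {
                double value = correlation(columns[i], columns[j]);
                out[i][j] = value;
                out[j][i] = value;
            }
        }
        return out;
    }
}
```

With this shape, a concrete measure would only need to implement correlation(double[], double[]).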

Finally, if you do think a Correlation interface makes sense, I'd also like to propose a
NonParametricCorrelation interface for Spearman's and Kendall's, which would have an additional
method for computing the correlation between two List<Comparable> objects.
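
As a rough illustration of why this could work for rank-based measures, the sketch below
replaces each element by the number of elements strictly smaller than it. That keeps the
ordering and the ties intact, which is all a rank-based measure looks at. The interface and
the orderCodes helper are hypothetical names, and the quadratic loop is only for clarity.

```java
import java.util.List;

/** Hypothetical interface for measures that depend only on the ordering of the data. */
interface NonParametricCorrelation {
    /** Correlation between two equal-length numeric samples. */
    double correlation(double[] x, double[] y);

    /** Correlation between two equal-length lists of comparable objects. */
    default <S extends Comparable<? super S>, T extends Comparable<? super T>>
            double correlation(List<S> x, List<T> y) {
        if (x.size() != y.size()) {
            throw new IllegalArgumentException("lists must have the same size");
        }
        return correlation(orderCodes(x), orderCodes(y));
    }

    /**
     * Maps each element to the count of elements strictly smaller than it.
     * Equal elements get equal codes, so order and ties are both preserved.
     */
    static <T extends Comparable<? super T>> double[] orderCodes(List<T> values) {
        double[] codes = new double[values.size()];
        for (int i = 0; i < codes.length; i++) {
            T current = values.get(i);
            int smaller = 0;
            for (T other : values) {
                if (other.compareTo(current) < 0) {
                    smaller++;
                }
            }
            codes[i] = smaller;
        }
        return codes;
    }
}
```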

> Kendalls Tau Implementation
> ---------------------------
>                 Key: MATH-814
>                 URL:
>             Project: Commons Math
>          Issue Type: New Feature
>    Affects Versions: 4.0
>         Environment: All
>            Reporter: devl
>            Assignee: Phil Steitz
>              Labels: correlation, rank
>             Fix For: 4.0
>         Attachments: kendalls-tau.patch
>   Original Estimate: 840h
>  Remaining Estimate: 840h
> Implement Kendall's Tau, which is a measure of association/correlation between ranked
> ordinal data.
> A basic description is available at
> however the test implementation will follow that defined by "Handbook of Parametric and Nonparametric
> Statistical Procedures, Fifth Edition, Page 1393 Test 30, ISBN-10: 1439858012 | ISBN-13: 978-1439858011."
> The algorithm is proposed as follows.
> Given two rankings or permutations represented by a 2D matrix, columns indicate rankings
> (e.g. by an individual) and rows are observations of each rank. The algorithm is to calculate
> the total number of concordant pairs of ranks (between columns) and discordant pairs of ranks
> (between columns), and to calculate the tau defined as
> tau = (number of concordant - number of discordant) / (n(n-1)/2)
>  where n(n-1)/2 is the total number of possible pairs of ranks.
> The method will then output the tau value between -1 and 1, where 1 signifies a "perfect"
> correlation between the two ranked lists.
> Where a tie exists within a ranking, the pair is counted as neither concordant nor discordant
> in the calculation. An optional merge sort can be used to speed up the implementation. Details
> are in the wiki page.
> Although this implementation is not particularly complex, it would be useful to have it
> in a consistent format in the commons math package in addition to the existing correlation tests.
> Kendall's Tau is used effectively in comparing ranks for products, rankings from search engines
> or measurements from engineering equipment.
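
For reference, the formula quoted above can be sketched directly as a pairwise loop. This is
only an illustration of the description in the issue, not the attached patch: the class and
method names are hypothetical, it takes the two columns as separate arrays, and it uses the
simple O(n^2) comparison rather than the optional merge sort.

```java
/** Hypothetical sketch of the tau described in the issue; not the attached patch. */
final class KendallsTauSketch {
    private KendallsTauSketch() {
    }

    /** tau = (concordant - discordant) / (n(n-1)/2); tied pairs count as neither. */
    static double tau(double[] x, double[] y) {
        if (x.length != y.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        int n = x.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        long concordant = 0;
        long discordant = 0;
        for (int i = 0; i < n - 1; i++) {
            for (int j = i + 1; j < n; j++) {
                // Positive when both columns order the pair the same way,
                // negative when they disagree, zero when either column ties.
                double sign = Math.signum(x[i] - x[j]) * Math.signum(y[i] - y[j]);
                if (sign > 0) {
                    concordant++;
                } else if (sign < 0) {
                    discordant++;
                }
            }
        }
        long pairs = (long) n * (n - 1) / 2;
        return (concordant - discordant) / (double) pairs;
    }
}
```

For example, x = {1, 2, 3} against y = {1, 3, 2} has two concordant pairs and one discordant
pair out of three, so tau is (2 - 1) / 3.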

This message was sent by Atlassian JIRA
