commons-issues mailing list archives

From Phil Steitz
Subject Re: [jira] Commented: (MATH-431) New tests: Wilcoxon signed-rank test and Mann-Whitney U
Date Sun, 07 Nov 2010 14:44:02 GMT
On 11/7/10 9:17 AM, Mikkel Meyer Andersen wrote:
> 2010/11/7 Phil Steitz:
>> On 11/6/10 12:44 PM, Mikkel Meyer Andersen wrote:
>>> 2010/11/6 Phil Steitz (JIRA):
>>>> Phil Steitz commented on MATH-431:
>>>> ----------------------------------
>>>> +1 for including both of these tests.  Then on to MATH-228
>>> Anything I should do in regard to that?
>> What we need there is a good algorithm for approximating the KS
>> distribution.  I have been corresponding with the author of a very good one
>> with a Java implementation but have thus far failed in getting consent to
>> release under ASL.  So at this point, I am looking for an alternative good
>> algorithm to implement.  All suggestions / unencumbered patches welcome!
>> See comments on MATH-431 for other questions.
> Just to be sure of what you mean:
> Do you want to have a two-sample Kolmogorov-Smirnov test for equality
> of distributions in addition to the Mann-Whitney? Or do you need the
> Kolmogorov-Smirnov distribution (as stated for example at
> ) in regards to the MATH-428? Sorry, but I'm a bit confused :-).

The goal is to implement the KS test for equality of distributions
(or homogeneity against a reference distribution).  To do that we
need at least critical values of the Kolmogorov distribution.  The
natural way for us to do that would be to implement the full
distribution, which would also be nice to have in the distributions
package.
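
To make the two pieces concrete, here is a minimal sketch, not a
proposed patch.  It computes the two-sample KS statistic from the
empirical CDFs and approximates the p-value with the classical
asymptotic series for the Kolmogorov distribution.  The class and
method names (KsSketch, twoSampleStatistic, kolmogorovCdf,
asymptoticPValue) are hypothetical and not part of Commons Math, and
the asymptotic series is only a large-sample approximation; it is not
the encumbered algorithm mentioned above.

```java
import java.util.Arrays;

/** Illustrative sketch only; names are hypothetical, not Commons Math API. */
final class KsSketch {

    /** Two-sample statistic D = sup_t |F_x(t) - F_y(t)| over the empirical CDFs. */
    static double twoSampleStatistic(double[] x, double[] y) {
        double[] sx = x.clone();
        double[] sy = y.clone();
        Arrays.sort(sx);
        Arrays.sort(sy);
        int i = 0;
        int j = 0;
        double d = 0.0;
        while (i < sx.length && j < sy.length) {
            // Step both empirical CDFs past the next distinct value t,
            // so tied observations are consumed together.
            double t = Math.min(sx[i], sy[j]);
            while (i < sx.length && sx[i] <= t) {
                i++;
            }
            while (j < sy.length && sy[j] <= t) {
                j++;
            }
            double diff = Math.abs((double) i / sx.length - (double) j / sy.length);
            d = Math.max(d, diff);
        }
        return d;
    }

    /**
     * Asymptotic Kolmogorov CDF:
     * K(t) = 1 - 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 t^2).
     * Below t = 0.2 the true value is under 1e-12, so 0 is returned
     * rather than summing a slowly converging series.
     */
    static double kolmogorovCdf(double t) {
        if (t < 0.2) {
            return 0.0;
        }
        double sum = 0.0;
        double sign = 1.0;
        for (int k = 1; k <= 100; k++) {
            double term = Math.exp(-2.0 * k * k * t * t);
            sum += sign * term;
            if (term < 1e-16) {
                break;
            }
            sign = -sign;
        }
        return Math.max(0.0, Math.min(1.0, 1.0 - 2.0 * sum));
    }

    /** Large-sample p-value for statistic d with sample sizes n and m. */
    static double asymptoticPValue(double d, int n, int m) {
        double scale = Math.sqrt((double) n * m / (n + m));
        return 1.0 - kolmogorovCdf(scale * d);
    }
}
```

A caller would compute d with twoSampleStatistic(x, y) and pass it to
asymptoticPValue(d, x.length, y.length).  For small samples the
asymptotic value can be noticeably off, which is why a good
approximation of the finite-n distribution is what we are really
after.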

>>>> Interesting approach for the exact algorithm for Wilcoxon.  If we stay
>>>> with this, we should ack the original author of the algorithm in the
>>>> javadoc.  Looks OK to use.
>>> Agree - both on the approach and legal part! Does the author need to
>>> sign anything but write a mail?
>>>>   Regarding the difference from R, what I usually do in this case is look
>>>> at the R sources to try to explain the difference.  Most likely in this
>>>> case, what is going on is they are using a different estimation algorithm
>>>> for small n or treating ties differently.  The ranking options that we use
>>>> were largely adapted from R, so if that is the problem, it should be easy
>>>> to test.  We need to convince ourselves that ours is better or at least a
>>>> legitimate alternative.  I will take a close look this evening, but it looks
>>>> like the algorithm you are using should be exact.  If we can't reconcile
>>>> the difference with R, it would be good to find a way to validate correct
>>>> functioning of the algorithm by manufacturing reference data with known p.
>>> I'll try to investigate the difference, hopefully tomorrow, so that
>>> formal tests can be written and included.
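
One way to manufacture reference data with known p for the signed-rank
test is to enumerate the null distribution directly for small n.  The
sketch below assumes n nonzero differences with no ties, so the ranks
are exactly 1..n and each rank is positive with probability 1/2.  The
class and method names (SignedRankReference, subsetSumCounts,
exactTwoSidedP) are hypothetical, and this is not necessarily the
algorithm used in the attached patch.

```java
/** Illustrative sketch only; names are hypothetical, not Commons Math API. */
final class SignedRankReference {

    /** counts[s] = number of subsets of {1, ..., n} whose elements sum to s. */
    static long[] subsetSumCounts(int n) {
        int max = n * (n + 1) / 2;
        long[] counts = new long[max + 1];
        counts[0] = 1;
        for (int r = 1; r <= n; r++) {
            // Iterate downwards so each rank r is used at most once per subset.
            for (int s = max; s >= r; s--) {
                counts[s] += counts[s - r];
            }
        }
        return counts;
    }

    /**
     * Exact two-sided p-value for W = min(W+, W-), assuming n untied,
     * nonzero differences.  Intended for small n (say n <= 30).
     */
    static double exactTwoSidedP(int w, int n) {
        long[] counts = subsetSumCounts(n);
        long tail = 0;
        for (int s = 0; s <= w && s < counts.length; s++) {
            tail += counts[s];
        }
        double p = 2.0 * tail / Math.pow(2, n);
        return Math.min(1.0, p);
    }
}
```

For example, five differences that are all positive give W = 0, and
only one of the 2^5 sign assignments has rank sum 0, so the two-sided
p-value is 2/32 = 0.0625.  Values like this can be checked by hand and
used as fixed expectations in unit tests.
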
>>>>> New tests: Wilcoxon signed-rank test and Mann-Whitney U
>>>>> -------------------------------------------------------
>>>>>                  Key: MATH-431
>>>>>                  URL:
>>>>>              Project: Commons Math
>>>>>           Issue Type: New Feature
>>>>>             Reporter: Mikkel Meyer Andersen
>>>>>             Assignee: Mikkel Meyer Andersen
>>>>>             Priority: Minor
>>>>>          Attachments:,,
>>>>>    Original Estimate: 4h
>>>>>   Remaining Estimate: 4h
>>>>> Wilcoxon signed-rank test and Mann-Whitney U are commonly used
>>>>> non-parametric statistical hypothesis tests (e.g. instead of various
>>>>> t-tests when normality is not present).
>>>> --
>>>> This message is automatically generated by JIRA.
>>>> -
>>>> You can reply to this email to add a comment to the issue online.
