From: Mikkel Meyer Andersen
Date: Sun, 7 Nov 2010 15:48:56 +0100
Subject: Re: [jira] Commented: (MATH-431) New tests: Wilcoxon signed-rank test and Mann-Whitney U
To: issues@commons.apache.org

2010/11/7 Phil Steitz:
> On 11/7/10 9:17 AM, Mikkel Meyer Andersen wrote:
>>
>> 2010/11/7 Phil Steitz:
>>>
>>> On 11/6/10 12:44 PM, Mikkel Meyer Andersen wrote:
>>>>
>>>> 2010/11/6 Phil Steitz (JIRA):
>>>>>
>>>>>    [
>>>>> https://issues.apache.org/jira/browse/MATH-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12929054#action_12929054
>>>>> ]
>>>>>
>>>>> Phil Steitz commented on MATH-431:
>>>>> ----------------------------------
>>>>>
>>>>> +1 for including both of these tests.  Then on to MATH-228
>>>>
>>>> Anything I should do in regard to that?
>>>
>>> What we need there is a good algorithm for approximating the KS
>>> distribution.  I have been corresponding with the author of a very good
>>> one with a Java implementation but have thus far failed in getting
>>> consent to release under ASL.  So at this point, I am looking for an
>>> alternative good algorithm to implement.  All suggestions /
>>> unencumbered patches welcome!
>>>
>>> See comments on the MATH-431 for other questions.
>>>
>> Just to be sure of what you mean:
>> Do you want to have a two-sample Kolmogorov-Smirnov test for equality
>> of distributions in addition to the Mann-Whitney? Or do you need the
>> Kolmogorov-Smirnov distribution (as stated for example at
>>
>> http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test#Kolmogorov_distribution
>> ) in regards to the MATH-428? Sorry, but I'm a bit confused :-).
>
> The goal is to implement the KS test for equality of distributions (or
> homogeneity against a reference distribution).  To do that we need at
> least critical values of the Kolmogorov distribution.  The natural way
> for us to do that would be to implement the full distribution which
> would be nice to have in the distributions package.
>
> Phil

Have you read "Evaluating Kolmogorov's Distribution" by Marsaglia et al.
available on http://www.jstatsoft.org/v08/i18/paper ? And do you think
their approach would be the way to go?

>>>>>
>>>>> Interesting approach for the exact algorithm for Wilcoxon.  If we
>>>>> stay with this, we should ack the original author of the algorithm
>>>>> in the javadoc.  Looks OK to use.
>>>>
>>>> Agree - both on the approach and legal part! Does the author need to
>>>> sign anything but write a mail?
>>>>>
>>>>>  Regarding the difference from R, what I usually do in this case is
>>>>> look at the R sources to try to explain the difference.  Most likely
>>>>> in this case, what is going on is they are using a different
>>>>> estimation algorithm for small n or treating ties differently.  The
>>>>> ranking options that we use were largely adapted from R, so if that
>>>>> is the problem, it should be easy to test.  We need to convince
>>>>> ourselves that ours is better or at least a legitimate alternative.
>>>>> I will take a close look this evening, but it looks like the
>>>>> algorithm you are using should be exact.  If we can't reconcile the
>>>>> difference with R, it would be good to find a way to validate
>>>>> correct functioning of the algorithm by manufacturing reference data
>>>>> with known p.
>>>>
>>>> I'll try to investigate the difference, hopefully tomorrow, so that
>>>> formal tests can be written and included.
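
One way to get reference values with known p, as suggested above, is to
enumerate the null distribution of the signed-rank statistic directly.
The sketch below is only an illustration of that idea and is not the
algorithm in the attached patch: it assumes no ties and no zero
differences, so that under the null hypothesis every subset of the ranks
1..n is equally likely to be the set of positive ranks. The class name
WilcoxonExactSketch, its method names, and the cap of n <= 30 are
hypothetical choices made for this example.

```java
/**
 * Illustrative sketch (hypothetical names): exact null distribution of the
 * Wilcoxon signed-rank statistic W+, assuming no ties and no zero differences.
 */
final class WilcoxonExactSketch {

    /** Returns counts where counts[w] = number of subsets of {1..n} summing to w. */
    static long[] subsetSumCounts(int n) {
        int maxSum = n * (n + 1) / 2;
        long[] counts = new long[maxSum + 1];
        counts[0] = 1; // the empty subset
        for (int rank = 1; rank <= n; rank++) {
            // Iterate downward so that each rank is used at most once.
            for (int w = maxSum; w >= rank; w--) {
                counts[w] += counts[w - rank];
            }
        }
        return counts;
    }

    /** Exact two-sided p-value for an observed sum of positive ranks wPlus. */
    static double twoSidedPValue(int n, int wPlus) {
        if (n < 1 || n > 30) { // arbitrary cap chosen for this sketch
            throw new IllegalArgumentException("n must be in [1, 30]");
        }
        int maxSum = n * (n + 1) / 2;
        if (wPlus < 0 || wPlus > maxSum) {
            throw new IllegalArgumentException("wPlus out of range");
        }
        long[] counts = subsetSumCounts(n);
        // The null distribution is symmetric, so use the smaller tail.
        int wMin = Math.min(wPlus, maxSum - wPlus);
        long tail = 0;
        for (int w = 0; w <= wMin; w++) {
            tail += counts[w];
        }
        double lowerTail = tail / Math.pow(2, n); // 2^n equally likely sign patterns
        return Math.min(1.0, 2.0 * lowerTail);
    }

    public static void main(String[] args) {
        System.out.println(twoSidedPValue(5, 0));
        System.out.println(twoSidedPValue(10, 8));
    }
}
```

For small n these values can be checked by hand (for n = 5 and W+ = 0
only the empty subset qualifies, giving 2 * 1/32), which makes them
usable as reference data when comparing against R or another
implementation. Handling ties or zero differences would need a different
treatment and is left out here on purpose.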
>>>>>
>>>>>> New tests: Wilcoxon signed-rank test and Mann-Whitney U
>>>>>> -------------------------------------------------------
>>>>>>
>>>>>>                 Key: MATH-431
>>>>>>                 URL: https://issues.apache.org/jira/browse/MATH-431
>>>>>>             Project: Commons Math
>>>>>>          Issue Type: New Feature
>>>>>>            Reporter: Mikkel Meyer Andersen
>>>>>>            Assignee: Mikkel Meyer Andersen
>>>>>>            Priority: Minor
>>>>>>         Attachments: MannWhitneyUTest.java, MannWhitneyUTestImpl.java,
>>>>>> WilcoxonSignedRankTest.java, WilcoxonSignedRankTestImpl.java
>>>>>>
>>>>>>   Original Estimate: 4h
>>>>>>  Remaining Estimate: 4h
>>>>>>
>>>>>> Wilcoxon signed-rank test and Mann-Whitney U are commonly used
>>>>>> non-parametric statistical hypothesis tests (e.g. instead of various
>>>>>> t-tests when normality is not present).
>>>>>
>>>>> --
>>>>> This message is automatically generated by JIRA.
>>>>> -
>>>>> You can reply to this email to add a comment to the issue online.
>>>>>
>>>>>
>>>
>>>
>
>