commons-issues mailing list archives

From "Phil Steitz (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MATH-437) Kolmogorov Smirnov Distribution
Date Tue, 26 Mar 2013 22:01:15 GMT

    [ https://issues.apache.org/jira/browse/MATH-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13614601#comment-13614601 ]

Phil Steitz commented on MATH-437:
----------------------------------

I think we should bump this to 4.0 or at least 3.3. It was probably a mistake to put K-S in
the distribution package. The K-S distribution itself is of little practical use (to my
knowledge at least); I have never seen it used for anything but performing K-S tests. It is
tricky enough to compute the distribution function itself with any kind of numerical
stability, as the comments above and the literature around K-S tests confirm. Computing
moments is, as the reference where Luc (resourcefully!) found test data states, "intractable."
I think it may be best to steer clear of this and focus on just getting a good implementation
of the test itself, which should move to .inference. I would prefer to do a little more
research, though, to decide how best to set up the API and implementation for the test. It
could be that we would be better off not using the cdfs in the current impl and instead using
a beta approximation to compute p-values, as in [1]. Note also that since the discussion above
and the initial implementation, Simard has published [2], with some empirical findings on how
the various K-S approximation methods perform.

So to summarize, I think the first step is to agree on the K-S test API.  Then deprecate the
class in .distribution and move the test class to .inference.
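
To make the API question concrete, here is a minimal sketch of what a one-sample test could
compute. The class and method names (KsSketch, statistic, asymptoticPValue) are hypothetical
and are not existing Commons Math API. The p-value uses the classical asymptotic Kolmogorov
series, not the beta approximation of [1] or any of the methods compared in [2], so it should
only be trusted for reasonably large n.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch only; not part of Commons Math. */
public class KsSketch {

    /** Two-sided one-sample statistic D_n = sup_x |F_n(x) - F(x)|. */
    static double statistic(double[] sample, DoubleUnaryOperator cdf) {
        double[] x = sample.clone();
        Arrays.sort(x);
        int n = x.length;
        double d = 0.0;
        for (int i = 0; i < n; i++) {
            double f = cdf.applyAsDouble(x[i]);
            // The empirical cdf jumps from i/n to (i+1)/n at x[i]; check both sides.
            d = Math.max(d, Math.max((i + 1.0) / n - f, f - (double) i / n));
        }
        return d;
    }

    /**
     * Asymptotic p-value P(D_n > d) ~ 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 n d^2).
     * Assumption: n is large enough for the limiting distribution to apply.
     */
    static double asymptoticPValue(double d, int n) {
        double t = n * d * d;
        if (t < 0.04) {
            return 1.0; // the limiting cdf is negligible here, so p is essentially 1
        }
        double sum = 0.0;
        double sign = 1.0;
        for (int k = 1; k <= 100; k++) {
            double term = Math.exp(-2.0 * k * k * t);
            sum += sign * term;
            if (term < 1e-16) {
                break; // remaining terms cannot change the result
            }
            sign = -sign;
        }
        return Math.min(1.0, Math.max(0.0, 2.0 * sum));
    }
}
```

A caller would pass the sample and the hypothesized cdf, e.g.
KsSketch.statistic(data, x -> x) for a uniform(0,1) null, then feed the statistic and
sample size to asymptoticPValue. Keeping the statistic and the p-value computation separate
would let the p-value method be swapped for an exact or beta-based one later without
changing the test API.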

[1] http://www.ism.ac.jp/editsec/aism/pdf/054_3_0577.pdf
[2] http://www.jstatsoft.org/v39/i11/paper
                
> Kolmogorov Smirnov Distribution
> -------------------------------
>
>                 Key: MATH-437
>                 URL: https://issues.apache.org/jira/browse/MATH-437
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Mikkel Meyer Andersen
>            Assignee: Phil Steitz
>            Priority: Minor
>             Fix For: 3.2
>
>         Attachments: ks-distribution.patch, MATH437-with-test-take-1
>
>   Original Estimate: 0.25h
>  Remaining Estimate: 0.25h
>
> The Kolmogorov-Smirnov test (see [1]) is used to test whether one sample is drawn from a
> known probability distribution or whether two samples are from the same distribution. To
> evaluate the test statistic, the Kolmogorov-Smirnov distribution is used. Quite good
> asymptotics exist for the one-sided test, but it's more difficult for the two-sided test.
> [1]: http://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test

