commons-issues mailing list archives

From "Gilles (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MATH-1405) Kolmogorov-Smirnov fixTies can set minDelta too small for jiggler to have significant effect
Date Tue, 28 Feb 2017 23:39:45 GMT

    [ https://issues.apache.org/jira/browse/MATH-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15889124#comment-15889124 ]

Gilles commented on MATH-1405:
------------------------------

Thanks a lot for the report.

Could you create patches against the "master" branch? That is:
* set up the example as a JUnit test case (and check that it fails with the development version of the library),
* insert your proposed fix (see below).

By "temporary solution", do you mean that there could be a problem in adopting it?

I'd suggest instantiating the RNG outside the loop, as follows (see http://commons.apache.org/rng
and the distribution class in the development version of Commons Math):
{code}
import org.apache.commons.rng.UniformRandomProvider;
import org.apache.commons.rng.simple.RandomSource;
// ...
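// Create the RNG once, outside the loop.
final UniformRandomProvider rng = RandomSource.create(RandomSource.TWO_CMRES);
do {
    // The sampler depends on the current "minDelta", so it is recreated at each iteration.
    final RealDistribution.Sampler sampler =
        new UniformRealDistribution(-minDelta, minDelta).createSampler(rng);
    jitter(x, sampler);
    jitter(y, sampler);
    ties = hasTies(x, y);
    ct++;
    minDelta *= 2; // Widen the jitter range for the next attempt.
} while (ties && ct < 1000);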
{code}

Since {{jitter}} is private and used for this single purpose, it would probably be better,
performance-wise, to modify it to use the RNG directly; its signature would thus become:
{code}
private static void jitter(double[] data, UniformRandomProvider rng, double delta) {
    for (int i = 0; i < data.length; i++) {
        final double d = delta * (2 * rng.nextDouble() - 1);
        data[i] += d;
    }
}
{code}
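
To try the doubling loop and the RNG-based {{jitter}} in isolation, here is a minimal, self-contained sketch. It is not the library code: {{java.util.Random}} stands in for {{UniformRandomProvider}} so that it runs without Commons RNG, and {{JitterSketch}}, {{breakTies}} and this {{hasTies}} are hypothetical names written only for illustration.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

public class JitterSketch {

    /** Adds uniform noise in [-delta, delta) to every element, in place. */
    static void jitter(double[] data, Random rng, double delta) {
        for (int i = 0; i < data.length; i++) {
            data[i] += delta * (2 * rng.nextDouble() - 1);
        }
    }

    /** Returns true if any value occurs more than once in the combined samples. */
    static boolean hasTies(double[] x, double[] y) {
        final Set<Double> seen = new HashSet<>();
        for (double v : x) {
            if (!seen.add(v)) {
                return true;
            }
        }
        for (double v : y) {
            if (!seen.add(v)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Jitters both arrays until no ties remain (or 1000 attempts are used),
     * doubling the jitter range after each attempt.
     * Returns the number of attempts that were made.
     */
    static int breakTies(double[] x, double[] y, double minDelta, Random rng) {
        double delta = minDelta;
        int ct = 0;
        boolean ties;
        do {
            jitter(x, rng, delta);
            jitter(y, rng, delta);
            ties = hasTies(x, y);
            ct++;
            delta *= 2;
        } while (ties && ct < 1000);
        return ct;
    }

    public static void main(String[] args) {
        final double[] x = {1.375, 1.375, 1.08};
        final double[] y = {1.375, 1.2};
        // Start with a delta that is deliberately too small to change the values.
        final int attempts = breakTies(x, y, 1e-17, new Random(42L));
        System.out.println("attempts = " + attempts + ", ties left = " + hasTies(x, y));
    }
}
```

With a starting delta of 1e-17, the first few attempts cannot change values near 1.3 at all, so more than one attempt is needed; the doubling is what eventually makes the jitter effective.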


> Kolmogorov-Smirnov fixTies can set minDelta too small for jiggler to have significant effect
> --------------------------------------------------------------------------------------------
>
>                 Key: MATH-1405
>                 URL: https://issues.apache.org/jira/browse/MATH-1405
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.6.1
>            Reporter: Daniil Finkel
>
> For samples whose size product does not exceed LARGE_SAMPLE_PRODUCT and whose values are relatively large, fixTies() can compute a minDelta that is too small to have any effect on the "tied" values. This results in a MathInternalError, because jiggling with the ineffective minDelta fails to fix the ties.
> The following arrays exhibit this behavior when run with kolmogorovSmirnovTest(x, y) in 3.6.1:
> x = [1.3750969645841487, 1.0845460746754014, 1.3693352427126644, 1.329688765445783, 1.3392109491039106,
1.3532766470312723, 1.3187287426697727, 1.386273031970554, 1.3416950149276097, 1.0510872606482404,
1.3532766470312723, 1.3075923871137798, 1.3862730319705543, 1.3814421433922548, 1.0527927570919202,
1.3847314864464313, 1.319362658529506, 1.3579238253227275, 1.2455452272301641, 1.329688765445783,
1.3827781646781876, 1.0755168081687903, 1.2566273460024566, 1.3099622795250825, 1.357440924560318,
1.3519397370266515, 1.0927347979524134, 1.3566357346921618, 1.238800036669969, 1.2931730628634528,
1.048463407884969, 1.3779471642491719, 1.2978533797116658, 1.376230881554943, 1.166901202345226,
1.3690425182006263, 1.166901202345226, 1.2953476417603207, 1.0827945761165951, 1.2942406680885112,
1.224414840377028, 1.3910905417259205, 1.303231085263425, 1.348635183816037, 1.3750969645841487,
1.049648651501274, 1.3119534979602083, 1.0446033225080773, 1.0494686631294756, 1.3862026705844126,
1.2719496963348844, 1.3489938748102903, 1.3780468374004164, 1.3884878389662338, 1.3352682241994538,
1.3348722240568909, 1.3921944407986777, 1.0476833161122294, 1.0845460746754008, 1.344165352323966,
1.298548179079665, 1.1979240079667628, 1.3539078973394736, 1.3187287426697725, 1.082794576116595,
1.3779471642491719, 1.3771347858434184, 1.3921944407986777, 1.193793081523992, 1.362050393265006,
1.076638744462226, 1.3551174562135766, 1.3393693468578751, 1.2470361076952952, 1.3696023478216113,
1.3750969645841487, 1.2964734722088322, 1.2953476417603207, 1.2470361076952952, 1.382661263313539,
1.3862026705844126, 1.3771240109822156, 1.25443884328785, 1.3136690818105938, 1.3853832858443051,
1.3486351838160378, 1.348026557887345, 1.0604869883721861, 1.3352682241994536, 1.3480480718535308,
1.3363233390543028, 1.154658436584056, 1.3921944407986775, 1.1979240079667626, 1.3620503932650059,
1.0881358731694244, 1.369042518200626, 1.3532766470312723, 1.2890012831575908, 1.3735565244300663]
> and
> y = [1.1262991662205104, 1.3136690818105938, 1.0446033225080773, 1.3551174562135764,
1.3032310852634252, 1.3806258468851462, 1.2270612333345983, 1.2719496963348844, 1.3601566259413194,
1.3756888280688913, 1.3475322202511097, 1.1937930815239919, 1.0510872606482404, 1.3441653523239654,
1.359738761905118, 1.3382152957887032, 1.0766387444622263, 1.1937930815239919, 1.0820779503060238,
1.1448104521200428, 1.3853832858443051, 1.28757746537949, 1.298548179079665, 1.067255392172351,
1.3168701741293156, 1.3910905417259205, 1.2908594990421354, 1.3750969645841487, 1.329688765445783,
1.386649365275275, 1.285486511663053, 1.2566273460024566, 1.323664826995234, 1.3862730319705538,
1.049346328049449]
> which produce minDelta = 1.11022302462516E-016
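
A short check of why a minDelta of this size is ineffective: for doubles in [1, 2) the spacing between adjacent values ({{Math.ulp}}) is about 2.22E-16, so adding or subtracting anything smaller than half of that rounds back to the original value. The sketch below is only an illustration; {{UlpCheck}} and {{jitterIsLost}} are hypothetical names, not part of Commons Math.

```java
public class UlpCheck {

    /** Returns true if shifting "value" by +delta and by -delta both leave it unchanged. */
    static boolean jitterIsLost(double value, double delta) {
        return value + delta == value && value - delta == value;
    }

    public static void main(String[] args) {
        final double value = 1.3750969645841487;   // first element of x in the report
        final double minDelta = 1.11022302462516E-16; // minDelta quoted in the report

        System.out.println("ulp(value)      = " + Math.ulp(value));
        System.out.println("minDelta lost?  = " + jitterIsLost(value, minDelta));
        System.out.println("1e-15 lost?     = " + jitterIsLost(value, 1e-15));
    }
}
```

Since the jitter is drawn from within [-minDelta, minDelta], every shift is at most this ineffective bound, which is consistent with the ties never being broken.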



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
