commons-dev mailing list archives

From Phil Steitz <phil.ste...@gmail.com>
Subject Re: [jira] [Commented] (MATH-1131) Kolmogorov-Smirnov Tests takes 'forever' on 10,000 item dataset
Date Wed, 25 Jun 2014 20:09:28 GMT
Sorry for responding to the list, but I only have mobile access at the moment. IIRC, the
roundedK method should not be creating matrices of BigFractions, but rather using doubles.
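
For illustration only, here is a minimal sketch of why exact rational powers get expensive
while doubles do not. It uses plain java.math.BigInteger rather than the Commons Math
BigFraction class, and the class and method names (FractionGrowthSketch, exactPower,
doublePower) are hypothetical, not taken from KolmogorovSmirnovTest. The base 4999/10000
is an arbitrary choice.

```java
import java.math.BigInteger;

public class FractionGrowthSketch {

    /**
     * Computes (num/den)^exponent exactly by repeated multiplication,
     * reducing by the gcd after every step.
     * Returns {numerator, denominator} in lowest terms.
     */
    public static BigInteger[] exactPower(long num, long den, int exponent) {
        BigInteger n = BigInteger.ONE;
        BigInteger d = BigInteger.ONE;
        BigInteger baseNum = BigInteger.valueOf(num);
        BigInteger baseDen = BigInteger.valueOf(den);
        for (int i = 0; i < exponent; i++) {
            n = n.multiply(baseNum);
            d = d.multiply(baseDen);
            // The gcd runs on integers that keep growing with every step.
            BigInteger g = n.gcd(d);
            n = n.divide(g);
            d = d.divide(g);
        }
        return new BigInteger[] {n, d};
    }

    /** The same power in double precision: fixed size, but rounded. */
    public static double doublePower(long num, long den, int exponent) {
        return Math.pow((double) num / den, exponent);
    }

    public static void main(String[] args) {
        for (int e : new int[] {10, 100, 1000}) {
            BigInteger[] f = exactPower(4999, 10000, e);
            System.out.println("exponent " + e
                    + ": exact denominator needs " + f[1].bitLength() + " bits"
                    + ", double value = " + doublePower(4999, 10000, e));
        }
    }
}
```

In this example the denominator's bit length grows in proportion to the exponent, so every
further multiply and gcd works on larger integers, whereas the double result always occupies
64 bits at the cost of rounding error.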

> On Jun 25, 2014, at 11:16 AM, "Thomas Neidhart (JIRA)" <jira@apache.org> wrote:
> 
> 
>    [ https://issues.apache.org/jira/browse/MATH-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043857#comment-14043857 ]
> 
> Thomas Neidhart commented on MATH-1131:
> ---------------------------------------
> 
> I did briefly debug the example and indeed the calculation hangs when calling roundedK,
> or more precisely in createH.
> 
> There, powers of BigFraction objects are created with very large numerators and denominators.
> Some of the later calculations then take forever because of this, e.g. when internally
> calculating the gcd.
> 
> In the implementation from the referenced paper, the H values are computed with double
> precision. Was there a specific reason to use BigFraction in our implementation? Is there
> a specific need for that level of accuracy for the Kolmogorov-Smirnov Test? The other
> inference tests do not seem to be so stringent.
> 
> It looks like there is no easy way to limit the maxDenominator when calling multiply(),
> as is possible when creating a BigFraction object.
> 
> 
>> Kolmogorov-Smirnov Tests takes 'forever' on 10,000 item dataset
>> ---------------------------------------------------------------
>> 
>>                Key: MATH-1131
>>                URL: https://issues.apache.org/jira/browse/MATH-1131
>>            Project: Commons Math
>>         Issue Type: Bug
>>   Affects Versions: 3.3
>>        Environment: Java 8
>>           Reporter: Schalk W. Cronjé
>>        Attachments: 1.txt, ReproduceKsIssue.groovy, ReproduceKsIssue.java
>> 
>> 
>> I have code simplified to the following:
>>    KolmogorovSmirnovTest kst = new KolmogorovSmirnovTest();
>>    NormalDistribution nd = new NormalDistribution(mean,stddev);
>>    kst.kolmogorovSmirnovTest(nd,dataset);
>> I find that for my dataset of 10,000 items, the call to kolmogorovSmirnovTest takes
>> 'forever'. It has not returned after nearly 15 minutes, and in one of my tests it has gone
>> over 150MB in memory usage.
> 
> 
> 
> --
> This message was sent by Atlassian JIRA
> (v6.2#6252)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org

