Date: Thu, 26 Jun 2014 13:17:24 +0000 (UTC)
From: "Thomas Neidhart (JIRA)"
To: issues@commons.apache.org
Subject: [jira] [Comment Edited] (MATH-1131) Kolmogorov-Smirnov Tests takes 'forever' on 10,000 item dataset

    [ https://issues.apache.org/jira/browse/MATH-1131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044641#comment-14044641 ]

Thomas Neidhart edited comment on MATH-1131 at 6/26/14 1:16 PM:
----------------------------------------------------------------

My previous comment wrt performance of matrix.power(n) was wrong.
This is not the limiting factor when using a BlockRealMatrix, as the number of actual matrix multiplications is only log(n).

The problem when using such large samples is that the matrix elements quickly grow and lead to NaN computations.
The reference code does a special trick when computing power(n):

 * after every multiplication, check whether the center element is > 1e140 and, if so, divide the whole matrix by this factor.
 * update the factor each time it is applied to the matrix
 * after computing power(n), the factor is applied in a reverse manner on the element to be returned.

> Kolmogorov-Smirnov Tests takes 'forever' on 10,000 item dataset
> ---------------------------------------------------------------
>
>                 Key: MATH-1131
>                 URL: https://issues.apache.org/jira/browse/MATH-1131
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.3
>         Environment: Java 8
>            Reporter: Schalk W. Cronjé
>         Attachments: 1.txt, ReproduceKsIssue.groovy, ReproduceKsIssue.java
>
>
> I have code simplified to the following:
> KolmogorovSmirnovTest kst = new KolmogorovSmirnovTest();
> NormalDistribution nd = new NormalDistribution(mean,stddev);
> kst.kolmogorovSmirnovTest(nd,dataset);
> I find that for my dataset of 10,000 items, the call to kolmogorovSmirnovTest takes 'forever'. It has not returned after nearly 15 minutes and in one of my tests has gone over 150MB in memory usage.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
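For illustration, the rescaling trick described in the comment above could be sketched roughly as follows. This is not the Commons Math code or the reference code itself: the class `RescaledMatrixPower`, the holder type `Scaled`, and the helper names are hypothetical, plain `double[][]` arrays stand in for a `BlockRealMatrix`, and the threshold 1e140 is the only value taken from the comment. The power is computed by repeated squaring, so only on the order of log(n) multiplications are needed, and a separate power-of-ten exponent records how often the matrix has been scaled down.

```java
// Hypothetical sketch: matrix power by repeated squaring with rescaling.
final class RescaledMatrixPower {

    /** A matrix plus a power-of-ten exponent; the true value is m * 10^exp10. */
    static final class Scaled {
        final double[][] m;
        final int exp10;

        Scaled(double[][] m, int exp10) {
            this.m = m;
            this.exp10 = exp10;
        }
    }

    /** Plain square-matrix product; a and b are left unmodified. */
    static double[][] multiply(double[][] a, double[][] b) {
        int size = a.length;
        double[][] c = new double[size][size];
        for (int i = 0; i < size; i++) {
            for (int k = 0; k < size; k++) {
                double aik = a[i][k];
                for (int j = 0; j < size; j++) {
                    c[i][j] += aik * b[k][j];
                }
            }
        }
        return c;
    }

    /** If the center element exceeds 1e140, scale the matrix down and record it. */
    static Scaled rescale(double[][] m, int exp10) {
        int center = m.length / 2;
        if (m[center][center] > 1e140) {
            for (double[] row : m) {
                for (int j = 0; j < row.length; j++) {
                    row[j] *= 1e-140;
                }
            }
            exp10 += 140; // update the factor each time it is applied
        }
        return new Scaled(m, exp10);
    }

    /** Computes a^n (n >= 1) as a scaled matrix using about log2(n) squarings. */
    static Scaled power(double[][] a, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        if (n == 1) {
            double[][] copy = new double[a.length][];
            for (int i = 0; i < a.length; i++) {
                copy[i] = a[i].clone();
            }
            return new Scaled(copy, 0);
        }
        Scaled half = power(a, n / 2);
        double[][] result = multiply(half.m, half.m);
        int exp10 = 2 * half.exp10; // squaring doubles the recorded exponent
        if (n % 2 != 0) {
            result = multiply(a, result); // a itself carries exponent 0
        }
        return rescale(result, exp10);
    }
}
```

To read off an element, the factor is applied in reverse: the true value of entry (i, j) is `r.m[i][j] * Math.pow(10, r.exp10)` for `r = RescaledMatrixPower.power(a, n)`. When that product would itself overflow a double, the sum `Math.log10(r.m[i][j]) + r.exp10` can be used instead. Note that the check only guards the center element, so the sketch relies on the other entries not being much larger than it.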