commons-issues mailing list archives

From "Thomas Neidhart (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (MATH-1154) Statistical tests in stat.inference package are very slow due to implicit RandomGenerator initialization
Date Mon, 06 Oct 2014 21:31:33 GMT

     [ https://issues.apache.org/jira/browse/MATH-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Neidhart updated MATH-1154:
----------------------------------
    Attachment: MATH-1154.patch

I have attached a proposed patch. I suggest addressing the issue as follows:

 * the patch updates all inference tests to create distributions with a null rng, which avoids
the initialization overhead; this is safe because the tests never sample from the created
distributions

 * re-open MATH-1124 and discuss on the mailing list whether to adopt the proposed lazy
initialization for the distributions or to change the default rng from WellXXX to something
else

 * create an additional ticket to address potential performance improvements for the WellXXX
rngs; this cannot be done before 4.0, though.
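
To illustrate the first point, here is a minimal, self-contained sketch of the idea. It does not use Commons Math itself: CostlyRng, ToyBinomial, upperTail and the RNG_CREATED counter are hypothetical stand-ins invented for this example. In the library, the corresponding call would pass null as the RandomGenerator argument of the distribution constructor, as suggested in the issue description.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-ins only; none of these classes exist in Commons Math. */
public class NullRngSketch {

    /** Counts constructed generators so the avoided work becomes visible. */
    static final AtomicInteger RNG_CREATED = new AtomicInteger();

    /** Stand-in for a generator whose construction is assumed to be expensive. */
    static final class CostlyRng {
        private final Random delegate = new Random();

        CostlyRng() {
            RNG_CREATED.incrementAndGet();
        }

        double nextDouble() {
            return delegate.nextDouble();
        }
    }

    /** Toy binomial distribution with the same two constructor styles. */
    static final class ToyBinomial {
        private final CostlyRng rng; // may be null if sample() is never called
        private final int trials;
        private final double p;

        /** Default style: always pays for a generator. */
        ToyBinomial(int trials, double p) {
            this(new CostlyRng(), trials, p);
        }

        /** Explicit style: the caller decides, and may pass null. */
        ToyBinomial(CostlyRng rng, int trials, double p) {
            this.rng = rng;
            this.trials = trials;
            this.p = p;
        }

        /** P(X = k); needs no random numbers. */
        double probability(int k) {
            if (k < 0 || k > trials) {
                return 0.0;
            }
            double c = 1.0; // binomial coefficient C(trials, k)
            for (int i = 1; i <= k; i++) {
                c = c * (trials - k + i) / i;
            }
            return c * Math.pow(p, k) * Math.pow(1 - p, trials - k);
        }

        /** Sampling is the only operation that needs the generator. */
        int sample() {
            if (rng == null) {
                throw new IllegalStateException("no generator supplied");
            }
            int successes = 0;
            for (int i = 0; i < trials; i++) {
                if (rng.nextDouble() < p) {
                    successes++;
                }
            }
            return successes;
        }
    }

    /** P(X >= successes), computed without ever creating a generator. */
    static double upperTail(int trials, int successes, double p) {
        ToyBinomial dist = new ToyBinomial(null, trials, p);
        double sum = 0.0;
        for (int k = successes; k <= trials; k++) {
            sum += dist.probability(k);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("upper tail: " + upperTail(20, 15, 0.5));
        System.out.println("generators created: " + RNG_CREATED.get());
    }
}
```

The trade-off is that calling sample() on such an instance fails, so this only suits code paths that never sample, which is the case for the inference tests.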

> Statistical tests in stat.inference package are very slow due to implicit RandomGenerator initialization
> --------------------------------------------------------------------------------------------------------
>
>                 Key: MATH-1154
>                 URL: https://issues.apache.org/jira/browse/MATH-1154
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.3
>            Reporter: Otmar Ertl
>         Attachments: MATH-1154.patch, math3.patch
>
>
> Some statistical tests defined in the stat.inference package (e.g. BinomialTest or ChiSquareTest) are unnecessarily slow (up to a factor of 20 slower than necessary). The reason is the implicit, slow initialization of a default (Well19937c) random generator instance each time a test is performed. The affected tests create a distribution instance in order to use some of its methods, but they never use any method for random generation. Nevertheless, a random number generator instance is automatically created whenever a distribution instance is created, which is the reason for the serious slowdown. The problem is related to MATH-1124.
> There are the following solutions:
> 1) Fix the affected statistical tests by passing a lightweight RandomGenerator implementation (or even null) to the constructor of the distribution.
> 2) Or use, for all distributions, a RandomGenerator implementation that uses lazy initialization to create the Well19937c instance as late as possible. This would also solve MATH-1124.
> I will attach a patch proposal together with a performance test that demonstrates the speed-up after the fix.
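
The lazy initialization described in solution 2 could look roughly like the sketch below. It is self-contained and deliberately avoids the Commons Math types: SimpleRng, LazyRng, newLazyDefault and the CREATED counter are hypothetical names for this example, and java.util.Random stands in for the expensive default generator.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch; not the Commons Math RandomGenerator API. */
public class LazyRngSketch {

    /** Minimal generator interface assumed for this example. */
    interface SimpleRng {
        double nextDouble();
    }

    /** Creates the real generator only when the first number is requested. */
    static final class LazyRng implements SimpleRng {
        private final Supplier<SimpleRng> factory;
        private SimpleRng delegate; // stays null until first use

        LazyRng(Supplier<SimpleRng> factory) {
            this.factory = factory;
        }

        @Override
        public synchronized double nextDouble() {
            if (delegate == null) {
                delegate = factory.get(); // the expensive step happens here, once
            }
            return delegate.nextDouble();
        }
    }

    /** Counts how often the underlying generator was actually built. */
    static final AtomicInteger CREATED = new AtomicInteger();

    /** A lazy default generator; java.util.Random stands in for Well19937c. */
    static SimpleRng newLazyDefault() {
        return new LazyRng(() -> {
            CREATED.incrementAndGet();
            Random r = new Random();
            return () -> r.nextDouble();
        });
    }

    public static void main(String[] args) {
        SimpleRng rng = newLazyDefault();
        System.out.println("created before first use: " + CREATED.get());
        rng.nextDouble();
        System.out.println("created after first use: " + CREATED.get());
    }
}
```

With such a wrapper as the default, a distribution that is only used for its probability methods never triggers the expensive initialization, while sampling behaves as before apart from a one-time cost on the first call.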



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
