commons-issues mailing list archives

From "Gilles (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MATH-1154) Statistical tests in stat.inference package are very slow due to implicit RandomGenerator initialization
Date Tue, 07 Oct 2014 13:28:33 GMT

    [ https://issues.apache.org/jira/browse/MATH-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14161867#comment-14161867 ]

Gilles commented on MATH-1154:
------------------------------

bq. lazy initialization was discouraged

AFAICT, lazy initialization of the distribution's RNG was considered an unnecessary complication
(of the distribution classes).

The patch seems to provide an elegant solution. It could be argued that the distribution's
RNG is still _not_ lazily initialized; it is the underlying implementation that is.
An "average" user who trusts the provided default will gain in all cases, and a "power" user
(like you) can still force a "null" RNG for cases where he knows that no sampling will be
requested.
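
To make the distinction concrete, here is a minimal sketch of the idea. It is not the attached
patch and does not use the Commons Math API: LazyRandom and UniformSketch are hypothetical
names, and java.util.Random stands in for the generator that is expensive to construct
(Well19937c in this issue). The distribution receives its generator eagerly, but the costly
instance behind it is built only on the first sampling call.

```java
import java.util.Random;
import java.util.function.Supplier;

/**
 * Hypothetical wrapper: the wrapper itself is cheap to create, and the
 * expensive underlying generator is built only when first needed.
 * java.util.Random is a stand-in for a costly generator (assumption).
 */
final class LazyRandom {
    private final Supplier<Random> factory;
    private Random delegate; // stays null until the first sampling call

    LazyRandom(Supplier<Random> factory) {
        this.factory = factory;
    }

    /** True once the underlying generator has actually been built. */
    synchronized boolean isInitialized() {
        return delegate != null;
    }

    private synchronized Random delegate() {
        if (delegate == null) {
            delegate = factory.get(); // the expensive step happens here
        }
        return delegate;
    }

    double nextDouble() {
        return delegate().nextDouble();
    }
}

/** Hypothetical distribution on [0, 1): holds a generator, samples only on request. */
final class UniformSketch {
    private final LazyRandom rng; // may be null if sampling is never needed

    UniformSketch(LazyRandom rng) {
        this.rng = rng;
    }

    /** Uses no randomness, so it never triggers generator initialization. */
    double cumulativeProbability(double x) {
        return Math.max(0.0, Math.min(1.0, x));
    }

    double sample() {
        if (rng == null) {
            throw new IllegalStateException("no generator supplied");
        }
        return rng.nextDouble();
    }
}
```

In this sketch a statistical test that only calls cumulativeProbability never pays for the
generator, whether it passes a LazyRandom or null; only sample() forces the initialization.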


> Statistical tests in stat.inference package are very slow due to implicit RandomGenerator initialization
> --------------------------------------------------------------------------------------------------------
>
>                 Key: MATH-1154
>                 URL: https://issues.apache.org/jira/browse/MATH-1154
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.3
>            Reporter: Otmar Ertl
>         Attachments: MATH-1154.patch, math3.patch
>
>
> Some statistical tests defined in the stat.inference package (e.g. BinomialTest or ChiSquareTest)
> are unnecessarily slow (up to a factor of 20 slower than necessary). The reason is the implicit,
> slow initialization of a default (Well19937c) random generator instance each time a test is
> performed. The affected tests create a distribution instance in order to use some methods
> defined therein. However, they do not use any method for random generation. Nevertheless, a
> random number generator instance is automatically created whenever a distribution instance is
> created, which is the reason for the serious slowdown. The problem is related to MATH-1124.
> There are the following solutions:
> 1) Fix the affected statistical tests by passing a light-weight RandomGenerator implementation
> (or even null) to the constructor of the distribution.
> 2) Or use, for all distributions, a RandomGenerator implementation that uses lazy initialization
> to create the Well19937c instance as late as possible. This would also solve MATH-1124.
> I will attach a patch proposal together with a performance test that demonstrates
> the speedup after a fix.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
