commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gilles (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MATH-1300) BitsStreamGenerator#nextBytes(byte[]) is wrong
Date Sun, 20 Dec 2015 12:07:46 GMT

    [ https://issues.apache.org/jira/browse/MATH-1300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065723#comment-15065723
] 

Gilles commented on MATH-1300:
------------------------------

bq. there are unnecessary calls to next(int) in case the chunk size is a multiple of 4

Indeed, that's what Rostislav proposed to solve in the first place.
Depending on the answers to the questions in my first comment, we could make this as a new
feature request.
Or we can explicitly document that consecutive calls to nextBytes won't provide the same sequence
as single call (as per the above unit test).
The feature will work when size is a multiple of 4 (just with the fix that removes the additional
call when not necessary).

Then there was the issue (or not) of performance.

{noformat}
nextBytes (calls per timed block: 200000, timed blocks: 100, time unit: ms)
       name      time/call      std error total time      ratio      difference
CommonsMath 1.21513910e-04 3.20776342e-05 2.4303e+03 1.0000e+00  0.00000000e+00
  Rostislav 1.23101061e-04 2.58350393e-05 2.4620e+03 1.0131e+00  3.17430180e+01
     Thomas 2.14572528e-04 2.55932836e-05 4.2915e+03 1.7658e+00  1.86117237e+03
    Gilles1 1.28583021e-04 1.04224889e-05 2.5717e+03 1.0582e+00  1.41382220e+02
    Gilles2 1.21604685e-04 9.00060879e-06 2.4321e+03 1.0007e+00  1.81551100e+00
{noformat}

{noformat}
nextBytes (calls per timed block: 200000, timed blocks: 100, time unit: ms)
       name      time/call      std error total time      ratio      difference
CommonsMath 1.24257845e-04 2.47466019e-05 2.4852e+03 1.0000e+00  0.00000000e+00
  Rostislav 1.27903613e-04 2.89947446e-05 2.5581e+03 1.0293e+00  7.29153600e+01
     Thomas 2.23244881e-04 4.19456869e-05 4.4649e+03 1.7966e+00  1.97974071e+03
    Gilles1 1.34765909e-04 2.56119543e-05 2.6953e+03 1.0846e+00  2.10161271e+02
    Gilles2 1.28621420e-04 2.35835928e-05 2.5724e+03 1.0351e+00  8.72714880e+01
{noformat}

"Gilles1" is my above proposal (minus the unnecessary mask operation).
"Gilles2" is a variant of Rostislav's code but using static variables rather than hard-coded
numbers.

Based on this not really reliable kind of benchmarks, my position is that we should strive
to make the code
# not use hard-coded numbers
# self-documenting
# simple
# more documented (not at the Javadoc level but explaining the statements)
# safe (calling an overrideable method in a constructor is not - see e.g. {{MersenneTwister}})
# thread-safe
# fast

Thomas' version is the simplest but the performance obviously (?) suffers.
Rostislav's is (often) the fastest, but it is (much) harder to understand, relatively to the
small number of lines.


> BitsStreamGenerator#nextBytes(byte[]) is wrong
> ----------------------------------------------
>
>                 Key: MATH-1300
>                 URL: https://issues.apache.org/jira/browse/MATH-1300
>             Project: Commons Math
>          Issue Type: Bug
>    Affects Versions: 3.5
>            Reporter: Rostislav Krasny
>         Attachments: MersenneTwister2.java, TestMersenneTwister.java
>
>
> Sequential calls to the BitsStreamGenerator#nextBytes(byte[]) must generate the same
sequence of bytes, no matter by chunks of which size it was divided. This is also how java.util.Random#nextBytes(byte[])
works.
> When nextBytes(byte[]) is called with a bytes array of length multiple of 4 it makes
one unneeded call to next(int) method. This is wrong and produces an inconsistent behavior
of classes like MersenneTwister.
> I made a new implementation of the BitsStreamGenerator#nextBytes(byte[]) see attached
code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message