commons-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Phil Steitz <phil.ste...@gmail.com>
Subject Re: [math] new pseudo random number generators
Date Mon, 20 Sep 2010 12:40:31 GMT
On 9/20/10 6:30 AM, luc.maisonobe@free.fr wrote:
> Hi all,
>
> As those subscribed to the commit list may have noticed, I have
> added several new pseudo-random number generators to
> commons-math.
>
> I should not have add such a feature without discussing on the
> list before. It was a wrong move and I apologize about it. So I
> would like to start a discussion about this now.

Chalk it up to network latency ;)
>
> Up to 2.1, we have few pseudo random number generators. We have
> an interface RandomGenerator implemented by three classes: -
> JDKRandomGenerator that extends the JDK provided generator -
> AbstractRandomGenerator as a helper for users generators -
> BitStreamGenerator which in turn is extended only by
> MersenneTwister
>
> The JDK provided generator is a simple one that can be used only
> for very simple needs. The Mersenne Twister is on the other hand
> a fast generator with good properties well suited for Monte-Carlo
> simulation. It is equidistributed for generating vectors up to
> dimension 623 and has a huge period: 2^19937 - 1.
>
> Since Mersenne-Twister inception in 1997, some new generators
> have been created, retaining the good properties of Mersenne
> twister but removing some of its (few) drawbacks. The main one is
> that if initialized with a bits pool containing lots of zeroes,
> the pool will take a very long time time to stabilize with a
> roughly balanced number of zeros and ones.
>
> I would like to add such generators (well, I already did but can
> withdraw my commit). The ones I want to add are the WELL
> generators (Well Equidistributed Long period Linear) created by
> Fran├žois Panneton, Pierre L'Ecuyer and Makoto Matsumoto. They are
> described in their 2006 paper: Improved * Long-Period Generators
> Based on Linear Recurrences Modulo 2, ransactions on Mathematical
> Software, 32, 1 (2006) which is available
> at<http://www.iro.umontreal.ca/~lecuyer/myftp/papers/wellrng.pdf>.
>
>  The paper describes several generators from one family. The most
> interesting ones are WELL1024 (a small one), WELL19937 and
> WELL44497 (two large ones with huge periods that are both
> Mersenne primes). The two large ones exist in a basic version
> that is not Maximally Equidistributed (i.e. equidistributed in
> all possible dimensions and number of bits up to some threshold)
> and also in an extended version that add some tempering to the
> output to get an ME generator.
>
> The paper does not put any restriction on the algorithm. The
> reference implementation in C on the other hand is limited to
> non-commercial use only. In my implementation, I did not refer to
> this C implementation at all. I only compiled it on my personal
> computer to generate reference values and copied the values in
> the test cases. So this implementation is completely independent
> (and in fact the code is rather different since I used an
> abstract class for the global algorithm and one derived class for
> each specific generator. I rely on the JVM optimizer to do all
> the inlining and constants simplification that they did manually
> in the reference implementation.

Sounds OK, but we should verify somehow that the algorithm itself is 
not patented.  Does the C source say anything about that?  Might be 
good to ask about this on legal-discuss or ask one of the authors. 
I know this could apply to any of the algorithms we implement, but 
the ones that are >100 yrs old are a little safer ;)

>
> Another difference with the reference implementation is that our
> BitStreamGenerator abstract class requires the generator to
> provide only bits blocks and not double, and it computes the
> double by itself. This computation uses the full IEEE754 width
> for the doubles, i.e. it uses 53 bits. The generators are
> guaranteed to be equidistributed on large vectors of 32 bits (up
> to dimension 1390 for the 44497 version, since 1390 * 32<
> 44497). I guess this should work well also using chunks of 53
> bits (but of course with smaller dimensions), but it is not
> mathematically proven.
>
> So what do other think about this ? Should this really be
> included (in which case I will open a JIRA issue and resolve it
> immediately) or should it be removed from commons-math ? Is the
> independent implementation and the off-line reference data
> generation sufficient to say this code is unencumbered ? Is the
> 53 bits use for double generation a good thing or should we
> reduce it to 32 bits to stay within the proven behavior ?
>

+1 for including assuming all is OK with IP

Phil



> Once again, sorry for this too late discussion, I really should
> have started it before. Luc
>
> ---------------------------------------------------------------------
>
>
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
> For additional commands, e-mail: dev-help@commons.apache.org
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org


Mime
View raw message