hadoop-common-issues mailing list archives

From "RJ Nowling (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11152) Better random number generator
Date Tue, 16 Dec 2014 02:00:25 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14247594#comment-14247594 ]

RJ Nowling commented on HADOOP-11152:
-------------------------------------

Hi all,

What about quasi-random numbers?

If you generate a large number of samples using an RNG, the set of numbers will be approximately
uniformly distributed. However, if you take a small number of samples (say 50), you will often
see that they are not spread evenly over the range; e.g., samples may cluster. RNGs are not
necessarily guaranteed to stay uniform over small groups of samples. (Some may do that,
however.)

Quasi-random (low-discrepancy) sequences are constructed so that the generated numbers stay
evenly spread over the range regardless of whether you pick 100 or 10,000. You may want to
read this blog entry by John D. Cook for an example:
http://www.johndcook.com/blog/2009/03/16/quasi-random-sequences-in-art-and-integration/
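
To make the small-sample point concrete, here is a minimal sketch that puts 50 samples into 10
equal-width bins, once from j.u.Random and once from a base-2 van der Corput sequence (one of the
simplest quasi-random sequences). The class name, the bin count and the seed are only
illustrative assumptions, not anything from Hadoop or from the blog entry.

```java
import java.util.Arrays;
import java.util.Random;

/** Illustrative only: compares how evenly 50 samples fill 10 bins of [0, 1). */
final class QuasiRandomDemo {

    /** Returns the i-th element (i >= 1) of the base-2 van der Corput sequence. */
    static double vanDerCorput(int i) {
        double result = 0.0;
        double fraction = 0.5;
        // Mirror the binary digits of i around the radix point.
        while (i > 0) {
            if ((i & 1) == 1) {
                result += fraction;
            }
            i >>>= 1;
            fraction *= 0.5;
        }
        return result;
    }

    /** First n quasi-random samples in [0, 1). */
    static double[] quasiSamples(int n) {
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = vanDerCorput(i + 1);
        }
        return out;
    }

    /** First n pseudo-random samples in [0, 1) from java.util.Random. */
    static double[] pseudoSamples(int n, long seed) {
        Random rng = new Random(seed);
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = rng.nextDouble();
        }
        return out;
    }

    /** Counts how many samples fall into each of the equal-width bins. */
    static int[] histogram(double[] samples, int bins) {
        int[] counts = new int[bins];
        for (double s : samples) {
            counts[(int) (s * bins)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int n = 50;
        int bins = 10;
        // The seed 42L is an arbitrary choice for repeatability.
        System.out.println("pseudo: " + Arrays.toString(histogram(pseudoSamples(n, 42L), bins)));
        System.out.println("quasi:  " + Arrays.toString(histogram(quasiSamples(n), bins)));
    }
}
```

With 50 samples and 10 bins the expected count per bin is 5. The quasi-random counts stay close
to that, while the pseudo-random counts depend on the seed and will typically vary more.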

Numerical Recipes discusses algorithms for generating sequences of quasi-random numbers.

> Better random number generator
> ------------------------------
>
>                 Key: HADOOP-11152
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11152
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Luke Lu
>              Labels: newbie++
>
> HDFS-7122 showed that naive ThreadLocal usage of simple LCG based j.u.Random creates
> unacceptable distribution of random numbers for block placement. Similarly, ThreadLocalRandom
> in Java 7 (same static thread local with synchronized methods overridden) has the same problem.
>
> "Better" is defined as better quality and faster than j.u.Random (which is already much
> faster (20x) than SecureRandom).
> People (e.g. Numerical Recipes) have shown that by combining LCG and XORShift we can
> have a better fast RNG. It'd be worthwhile to investigate a thread local version of these
> "better" RNG.
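
For reference, a rough sketch of what a thread-local generator along these lines could look
like. It uses an xorshift step followed by a multiplication (the xorshift64* construction), which
is in the same spirit as, but not necessarily identical to, the LCG/XORShift combination the
issue description has in mind. The class name ThreadLocalXorShift, the seeding scheme and the
nextInt helper are all assumptions made for illustration; none of this is existing Hadoop code,
and its quality would still need to be tested.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: one xorshift64*-style generator per thread, no locking. */
final class ThreadLocalXorShift {

    // Assumed seeding scheme: each new thread gets a distinct seed from a shared counter.
    private static final AtomicLong SEEDER = new AtomicLong(System.nanoTime());
    private static final long SEED_STEP = 0x9E3779B97F4A7C15L;

    private static final ThreadLocal<ThreadLocalXorShift> LOCAL =
        new ThreadLocal<ThreadLocalXorShift>() {
            @Override
            protected ThreadLocalXorShift initialValue() {
                return new ThreadLocalXorShift(SEEDER.addAndGet(SEED_STEP));
            }
        };

    private long state;

    ThreadLocalXorShift(long seed) {
        // An xorshift state of zero would stay zero forever, so avoid it.
        this.state = (seed == 0L) ? SEED_STEP : seed;
    }

    /** Returns the generator owned by the calling thread. */
    static ThreadLocalXorShift current() {
        return LOCAL.get();
    }

    /** One xorshift step on the state, then a multiplicative scramble of the output. */
    long nextLong() {
        long x = state;
        x ^= x >>> 12;
        x ^= x << 25;
        x ^= x >>> 27;
        state = x;
        return x * 0x2545F4914F6CDD1DL;
    }

    /** Value in [0, bound); uses the high bits and accepts a small modulo bias. */
    int nextInt(int bound) {
        if (bound <= 0) {
            throw new IllegalArgumentException("bound must be positive");
        }
        return (int) ((nextLong() >>> 33) % bound);
    }
}
```

A caller would use it as ThreadLocalXorShift.current().nextInt(n). Because each thread owns its
own instance, no synchronization is needed on the hot path.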



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
