hadoop-common-issues mailing list archives

From "Jeff Buell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11152) Better random number generator
Date Mon, 29 Sep 2014 19:24:34 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14152122#comment-14152122 ]

Jeff Buell commented on HADOOP-11152:

Others have investigated skew due to random selection, e.g. http://cs.brown.edu/~rfonseca/pubs/ferguson-atc10-poster.pdf.
They got good results from a round-robin scheme. We need to make sure such a scheme works well
in a multi-threaded environment. In any case, what is needed is a scheme that actively tries
to spread the copies uniformly rather than relying on a completely random process.
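
For illustration only, here is a minimal sketch of what a thread-safe round-robin selector
could look like. The class name RoundRobinPicker and its pick() method are hypothetical and
are not part of Hadoop or of the work linked above; the sketch assumes the candidate targets
can be addressed by an index in [0, numTargets).

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: hands out target indices in strict rotation. */
final class RoundRobinPicker {
    // Shared counter; every caller claims the next slot with one atomic increment.
    private final AtomicInteger next = new AtomicInteger();
    private final int numTargets;

    RoundRobinPicker(int numTargets) {
        if (numTargets <= 0) {
            throw new IllegalArgumentException("numTargets must be positive");
        }
        this.numTargets = numTargets;
    }

    /** Safe to call from many threads without external locking. */
    int pick() {
        // floorMod keeps the index non-negative after the int counter overflows.
        return Math.floorMod(next.getAndIncrement(), numTargets);
    }
}
```

Because every call takes exactly one value from the shared counter, any numTargets
consecutive picks cover each index once, no matter how the calling threads interleave.
The single shared counter is a contention point, so whether this is fast enough under
heavy load would still need to be measured.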

> Better random number generator
> ------------------------------
>                 Key: HADOOP-11152
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11152
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Luke Lu
>              Labels: newbie++
> HDFS-7122 showed that naive ThreadLocal usage of the simple LCG-based j.u.Random creates
> an unacceptable distribution of random numbers for block placement. Similarly, ThreadLocalRandom
> in Java 7 (same static thread local with synchronized methods overridden) has the same problem.

> "Better" is defined as better quality and faster than j.u.Random (which is already much
> faster (20x) than SecureRandom).
> People (e.g. Numerical Recipes) have shown that by combining LCG and XORShift we can
> have a better fast RNG. It'd be worthwhile to investigate a thread-local version of these
> "better" RNGs.
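
As a rough sketch of the idea in the description, the class below XORs the output of a 64-bit
LCG with the output of a 64-bit xorshift generator and keeps one instance per thread. It is
not the Numerical Recipes generator and not anything proposed in this issue: the class name
LcgXorShiftRandom, the seeding scheme, and the choice to combine by XOR are assumptions made
for illustration. The LCG constants are Knuth's MMIX multiplier and increment, and the shift
triple (13, 7, 17) is one of Marsaglia's 64-bit xorshift triples. Its statistical quality has
not been tested here.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative generator: 64-bit LCG output XORed with 64-bit xorshift output. */
final class LcgXorShiftRandom {
    // Gives each thread a distinct starting seed (assumed scheme, for illustration).
    private static final AtomicLong SEEDER = new AtomicLong(System.nanoTime());

    // One independent generator per thread, so the hot path needs no locking.
    private static final ThreadLocal<LcgXorShiftRandom> LOCAL =
        ThreadLocal.withInitial(
            () -> new LcgXorShiftRandom(SEEDER.getAndAdd(0x9E3779B97F4A7C15L)));

    private long lcg; // LCG state
    private long xs;  // xorshift state; must never be zero

    LcgXorShiftRandom(long seed) {
        lcg = seed;
        xs = seed ^ 0x9E3779B97F4A7C15L;
        if (xs == 0) {
            xs = 0x9E3779B97F4A7C15L; // xorshift would stay at zero forever otherwise
        }
    }

    /** Returns the generator owned by the calling thread. */
    static LcgXorShiftRandom current() {
        return LOCAL.get();
    }

    long nextLong() {
        // LCG step with Knuth's MMIX constants (arithmetic is modulo 2^64).
        lcg = lcg * 6364136223846793005L + 1442695040888963407L;
        // xorshift step with the (13, 7, 17) triple.
        xs ^= xs << 13;
        xs ^= xs >>> 7;
        xs ^= xs << 17;
        return lcg ^ xs;
    }

    /** Returns a value in [0, bound); slightly biased when bound is not a power of two. */
    int nextInt(int bound) {
        if (bound <= 0) {
            throw new IllegalArgumentException("bound must be positive");
        }
        // Scale the top 31 bits into [0, bound) with a multiply and shift.
        return (int) (((nextLong() >>> 33) * bound) >>> 31);
    }
}
```

A caller would use LcgXorShiftRandom.current().nextInt(n) much like ThreadLocalRandom. Note
that a faster or better generator does not by itself address the placement skew discussed in
the comment above, since independent random choices can still land unevenly.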
