lucene-dev mailing list archives

From "Greg Bowyer (JIRA)" <>
Subject [jira] [Commented] (SOLR-3673) Random variate functions
Date Wed, 25 Jul 2012 06:15:35 GMT


Greg Bowyer commented on SOLR-3673:

This is where my total ignorance of these random generators and how they are used comes in: it
looked to me like these generators in your patch just took in a java.util.Random as input
– is there a particular reason why this Mrs. Twister random needs to be used? What does
that give us that java.util.Random doesn't?

They can take anything that extends java.util.Random. The issues with the inbuilt one are
that its chance of repeating itself is comparatively high (its period is only 2^48), the
numbers it generates have some statistically poor properties, and it's slightly

I don't lay claim to being an expert on this stuff; I am going on what I have been told. The
usage of MT is a side benefit of cheating on the distributions and using the ones that come
out of the box in uncommons-math: since I had a better RNG available, I used it.
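
To make the "anything that extends java.util.Random" point concrete, here is a minimal sketch. The class name GaussianVariate is hypothetical and this is not the code from the patch; it only assumes the generator is handed in through the constructor, so any subclass of java.util.Random (SecureRandom here, or an MT implementation that extends Random) can be swapped in.

```java
import java.security.SecureRandom;
import java.util.Random;

// Hypothetical illustration only; not the classes from SOLR-3673.patch.
final class GaussianVariate {
    private final Random rng;
    private final double mean;
    private final double stddev;

    // Accepts java.util.Random or any subclass of it.
    GaussianVariate(Random rng, double mean, double stddev) {
        if (stddev < 0) {
            throw new IllegalArgumentException("stddev must be >= 0");
        }
        this.rng = rng;
        this.mean = mean;
        this.stddev = stddev;
    }

    // Scales and shifts a standard normal draw from the supplied generator.
    double next() {
        return mean + stddev * rng.nextGaussian();
    }

    public static void main(String[] args) {
        // Same distribution code, two different underlying generators.
        GaussianVariate seeded = new GaussianVariate(new Random(42L), 10.0, 2.0);
        GaussianVariate secure = new GaussianVariate(new SecureRandom(), 10.0, 2.0);
        System.out.println(seeded.next());
        System.out.println(secure.next());
    }
}
```

The distribution never needs to know which generator it was given, which is why the choice of RNG can stay a separate decision.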

FWIW: 128 bits isn't that much if you let the seed argument to the function be an arbitrary
String - even if you ignore the high bits the user just needs to give you 16 chars (fewer if
we include stuff like the index version)

Yeah, it's not a lot and it's manageable; I was more thinking about avoiding it being too configurable.
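
As a rough sketch of the seeding idea above: one possible way (an assumption on my part, not something the patch does) to turn an arbitrary seed String into 128 bits is to hash it, since MD5 produces exactly 16 bytes. The class name SeedFromString and the indexVersion parameter are hypothetical, and MD5 is used here only to mix bits, not for any security property.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper; neither the name nor the approach comes from the patch.
final class SeedFromString {
    private SeedFromString() {}

    // Hashes the user-supplied seed plus an index version into 16 bytes (128 bits).
    static byte[] seed128(String userSeed, long indexVersion) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update(userSeed.getBytes(StandardCharsets.UTF_8));
            md5.update(ByteBuffer.allocate(Long.BYTES).putLong(indexVersion).array());
            return md5.digest();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the digests every Java platform must provide.
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }

    public static void main(String[] args) {
        byte[] seed = seed128("my-seed", 1234L);
        System.out.println(seed.length + " bytes of seed material");
    }
}
```

A short seed string still yields a full 16 bytes this way, although the result is only as unpredictable as the inputs that went into it.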

This is kind of where my "use case" question comes into play as well ... if the goal is just
to use these generators to get a "biased" shuffling of the docs (ie: maybe you use a certain
random distribution and then frange filter on it to get a set of documents with a roughly predictable
size) then it's not that bad if the seeds aren't very complex – throw in the SolrCore start
time to get a few more bits, etc.... But if there is some sort of cryptography goal then obviously
having a "good" random seed that is unpredictable is a lot more important.

The first use case, plus use cases that involve bending things towards distributions to act as
cheap models.
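
To illustrate the "roughly predictable size" use case, here is a small simulation in plain Java rather than Solr itself. It assumes each doc gets one standard gaussian draw from a seeded java.util.Random and is kept only if the value falls inside a range, which is roughly what a range filter over a gaussian function would do. The class name RangeFilteredSample is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical simulation of range-filtering on a random variate; not Solr code.
final class RangeFilteredSample {
    private RangeFilteredSample() {}

    // Keeps the doc ids whose gaussian draw lands in [lower, upper].
    static List<Integer> select(int numDocs, long seed, double lower, double upper) {
        Random rng = new Random(seed);
        List<Integer> kept = new ArrayList<>();
        for (int doc = 0; doc < numDocs; doc++) {
            double value = rng.nextGaussian();
            if (value >= lower && value <= upper) {
                kept.add(doc);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // About 68% of a standard normal lies within one standard deviation.
        List<Integer> kept = select(10_000, 42L, -1.0, 1.0);
        System.out.println(kept.size() + " of 10000 docs kept");
    }
}
```

With the same seed the same docs come back every time, and widening or narrowing the range moves the expected set size in a predictable way.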

This stuff is useless as it stands for crypto anyhow, since these RNGs are fairly predictable.

> Random variate functions
> ------------------------
>                 Key: SOLR-3673
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 4.0, 5.0
>            Reporter: Greg Bowyer
>            Assignee: Greg Bowyer
>         Attachments: SOLR-3673.patch
> Hi all
> At my $DAYJOB I have been asked to build a few random variate functions that return random
> numbers bound to a distribution.
> I think these can be added to solr.
> I have a hesitation in that the code as written uses / needs uncommons math (because
> we want a far better RNG than Java's and because I am lazy and did not want to write distributions)
> uncommons math is Apache licensed, so we are good on that front
> Does anyone have any thoughts on this?
> For reference the functions are:
> rgaussian(mean, stddev) -> Random value aligned to gaussian distribution
> rpoisson(mean) -> Random value aligned to poisson distribution
> rbinomial(n, prob) -> Random value aligned to binomial distribution
> rcontinous(min, max) -> Random continuous value between min and max
> rdiscrete(min, max) -> Random discrete value between min and max
> rexponential(rate) -> Random value from the exponential distribution
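
For readers who want a feel for what these functions compute, here is a minimal sketch using only java.util.Random. It is not the patch, which relies on uncommons math for the distributions. The class and method names are hypothetical, and details such as whether max is inclusive are assumptions.

```java
import java.util.Random;

// Hypothetical sketch; names and semantics are assumptions, not SOLR-3673.patch.
final class RandomVariates {
    private RandomVariates() {}

    static double gaussian(Random rng, double mean, double stddev) {
        return mean + stddev * rng.nextGaussian();
    }

    // Knuth's multiplication method; reasonable for small means only.
    static int poisson(Random rng, double mean) {
        double limit = Math.exp(-mean);
        int k = 0;
        double p = 1.0;
        do {
            k++;
            p *= rng.nextDouble();
        } while (p > limit);
        return k - 1;
    }

    // Counts successes over n independent Bernoulli trials.
    static int binomial(Random rng, int n, double prob) {
        int successes = 0;
        for (int i = 0; i < n; i++) {
            if (rng.nextDouble() < prob) {
                successes++;
            }
        }
        return successes;
    }

    // Uniform value in [min, max), up to floating-point rounding.
    static double continuous(Random rng, double min, double max) {
        return min + (max - min) * rng.nextDouble();
    }

    // Uniform integer in [min, max]; assumes max - min + 1 fits in an int.
    static int discrete(Random rng, int min, int max) {
        return min + rng.nextInt(max - min + 1);
    }

    // Inverse transform sampling; 1 - nextDouble() is in (0, 1], so the log is finite.
    static double exponential(Random rng, double rate) {
        return -Math.log(1.0 - rng.nextDouble()) / rate;
    }

    public static void main(String[] args) {
        Random rng = new Random(42L);
        System.out.println(gaussian(rng, 0.0, 1.0));
        System.out.println(poisson(rng, 4.0));
        System.out.println(binomial(rng, 10, 0.5));
        System.out.println(continuous(rng, 2.0, 5.0));
        System.out.println(discrete(rng, 1, 6));
        System.out.println(exponential(rng, 0.5));
    }
}
```

Every method takes the generator as an argument, so a better RNG can be substituted without touching the distribution code.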
