commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gilles (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (RNG-55) UnitSphereSampler
Date Wed, 15 Aug 2018 23:08:00 GMT

    [ https://issues.apache.org/jira/browse/RNG-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16581723#comment-16581723
] 

Gilles commented on RNG-55:
---------------------------

bq. check for zero length dimension

That would be good.

bq. The user will find out when they call nextVector() with an NegativeArraySizeException
anyway.

No, because zero size is allowed.

bq.  NormalizedGaussianSampler returns 0 

Indeed, the implementation must check for that condition.
If not fulfilled, I'd suggest to recursively call {{nextVector()}}.
I'd guess that the probability of exhausting the stack is vanishingly small.

bq. infinite loop if the NormalizedGaussianSampler always returns 0

With the above fix, it cannot happen (even with a broken RNG since a {{StackOverflowError}}
will occur).


> UnitSphereSampler
> -----------------
>
>                 Key: RNG-55
>                 URL: https://issues.apache.org/jira/browse/RNG-55
>             Project: Commons RNG
>          Issue Type: Bug
>          Components: sampling
>    Affects Versions: 1.1
>            Reporter: Alex D Herbert
>            Priority: Trivial
>
> The {{UnitSphereSampler}} does not check for zero length dimension. This doesn't cause
a fail but maybe should be raised as an error? The user will find out when they call {{nextVector()}}
with an {{NegativeArraySizeException}} anyway.
> However the algorithm can fail under the extremely unlikely condition that the {{NormalizedGaussianSampler}}
returns {{0}} for each dimension. This is more likely when using {{dimension==1}}.
> The returned vector will be {{Double.NaN}} for each dimension due to the use of {{1 /
Math.sqrt(0)}}:
> {code:java}
> // f will be Double.POSITIVE_INFINITY
> final double f = 1 / Math.sqrt(normSq);
> for (int i = 0; i < dimension; i++) {
>     // v will be Double.NaN as 0 * Double.POSITIVE_INFINITY is Double.NaN
>     v[i] *= f;
> }
> {code}
> Here is a test that demonstrates it. All it requires in the RNG to provide a long of
0 to the internal {{ZigguratNormalizedGaussianSampler}}:
> {code:java}
> @Test
> public void testBadRNG() {
>     final UniformRandomProvider rng = new UniformRandomProvider() {
>         public long nextLong(long n) { return 0; }
>         public long nextLong() { return 0; }
>         public int nextInt(int n) { return 0; }
>         public int nextInt() { return 0; }
>         public float nextFloat() { return 0; }
>         public double nextDouble() { return 0;}
>         public void nextBytes(byte[] bytes, int start, int len) {}
>         public void nextBytes(byte[] bytes) {}
>         public boolean nextBoolean() { return false; }
>     };
>     UnitSphereSampler s = new UnitSphereSampler(1, rng);
>     double[] v = s.nextVector();
>     Assert.assertNotNull(v);
>     Assert.assertEquals(1, v.length);
>     Assert.assertTrue(Double.isNaN(v[0]));
> }
> {code}
> This can be fixed by an outer loop:
> {code:java}
> /**
>  * @return a random normalized Cartesian vector.
>  */
> public double[] nextVector() {
>     final double[] v = new double[dimension];
>     // Pick a point by choosing a standard Gaussian for each element,
>     // and then normalize to unit length.
>     double normSq = 0;
>     while (normSq == 0) {
>         for (int i = 0; i < dimension; i++) {
>             final double comp = sampler.sample();
>             v[i] = comp;
>             normSq += comp * comp;
>         }
>     }
>     final double f = 1 / Math.sqrt(normSq);
>     for (int i = 0; i < dimension; i++) {
>         v[i] *= f;
>     }
>     return v;
> }
> {code}
> But this can then infinite loop if the {{NormalizedGaussianSampler}} always returns 0
(as for the dummy test case above).
> Q. What is the lesser evil of a vector with NaN (as with the current implementation)
or never returning in an extreme edge case?
> Throwing an exception would change the API.
> Returning the vector [1,0,0,....] would fix the edge case but not alert the user to a
broken RNG:
> {code:java}
> /**
>  * @return a random normalized Cartesian vector.
>  */
> public double[] nextVector() {
>     final double[] v = new double[dimension];
>     // Pick a point by choosing a standard Gaussian for each element,
>     // and then normalize to unit length.
>     double normSq = 0;
>     for (int i = 0; i < dimension; i++) {
>         final double comp = sampler.sample();
>         v[i] = comp;
>         normSq += comp * comp;
>     }
>     
>     if (normSq == 0) {
>         // Extreme edge case
>         if (dimension != 0)
>             v[0] = 1;
>         return v;
>     }
>     final double f = 1 / Math.sqrt(normSq);
>     for (int i = 0; i < dimension; i++) {
>         v[i] *= f;
>     }
>     return v;
> }
> {code}
> This is all extremely unlikely however I noticed as I was reviewing new classes in V1.1
and saw this problem that I have hit before in a random walk simulation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message