commons-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Gilles (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (RNG-55) UnitSphereSampler
Date Thu, 16 Aug 2018 13:30:00 GMT

    [ https://issues.apache.org/jira/browse/RNG-55?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16582509#comment-16582509
] 

Gilles commented on RNG-55:
---------------------------

Should be fixed in commit 74c2b58d321f1cd08e3a9b8f1911bada8008a319. Please have a look.

There is also the case where {{normSq}} would be {{Double.POSITIVE_INFINITY}}.  But that would
require {{n}} samples to be of the order of {{sqrt(Double.MAX_VALUE / n)}}.
If the Gaussian implementations could generate such large values (?), they'd be so rare, even
in one dimension, that I don't think it warrants an additional check.

> UnitSphereSampler
> -----------------
>
>                 Key: RNG-55
>                 URL: https://issues.apache.org/jira/browse/RNG-55
>             Project: Commons RNG
>          Issue Type: Bug
>          Components: sampling
>    Affects Versions: 1.1
>            Reporter: Alex D Herbert
>            Priority: Trivial
>
> The {{UnitSphereSampler}} does not check for zero length dimension. This doesn't cause
a fail but maybe should be raised as an error? The user will find out when they call {{nextVector()}}
with an {{NegativeArraySizeException}} anyway.
> However the algorithm can fail under the extremely unlikely condition that the {{NormalizedGaussianSampler}}
returns {{0}} for each dimension. This is more likely when using {{dimension==1}}.
> The returned vector will be {{Double.NaN}} for each dimension due to the use of {{1 /
Math.sqrt(0)}}:
> {code:java}
> // f will be Double.POSITIVE_INFINITY
> final double f = 1 / Math.sqrt(normSq);
> for (int i = 0; i < dimension; i++) {
>     // v will be Double.NaN as 0 * Double.POSITIVE_INFINITY is Double.NaN
>     v[i] *= f;
> }
> {code}
> Here is a test that demonstrates it. All it requires in the RNG to provide a long of
0 to the internal {{ZigguratNormalizedGaussianSampler}}:
> {code:java}
> @Test
> public void testBadRNG() {
>     final UniformRandomProvider rng = new UniformRandomProvider() {
>         public long nextLong(long n) { return 0; }
>         public long nextLong() { return 0; }
>         public int nextInt(int n) { return 0; }
>         public int nextInt() { return 0; }
>         public float nextFloat() { return 0; }
>         public double nextDouble() { return 0;}
>         public void nextBytes(byte[] bytes, int start, int len) {}
>         public void nextBytes(byte[] bytes) {}
>         public boolean nextBoolean() { return false; }
>     };
>     UnitSphereSampler s = new UnitSphereSampler(1, rng);
>     double[] v = s.nextVector();
>     Assert.assertNotNull(v);
>     Assert.assertEquals(1, v.length);
>     Assert.assertTrue(Double.isNaN(v[0]));
> }
> {code}
> This can be fixed by an outer loop:
> {code:java}
> /**
>  * @return a random normalized Cartesian vector.
>  */
> public double[] nextVector() {
>     final double[] v = new double[dimension];
>     // Pick a point by choosing a standard Gaussian for each element,
>     // and then normalize to unit length.
>     double normSq = 0;
>     while (normSq == 0) {
>         for (int i = 0; i < dimension; i++) {
>             final double comp = sampler.sample();
>             v[i] = comp;
>             normSq += comp * comp;
>         }
>     }
>     final double f = 1 / Math.sqrt(normSq);
>     for (int i = 0; i < dimension; i++) {
>         v[i] *= f;
>     }
>     return v;
> }
> {code}
> But this can then infinite loop if the {{NormalizedGaussianSampler}} always returns 0
(as for the dummy test case above).
> Q. What is the lesser evil of a vector with NaN (as with the current implementation)
or never returning in an extreme edge case?
> Throwing an exception would change the API.
> Returning the vector [1,0,0,....] would fix the edge case but not alert the user to a
broken RNG:
> {code:java}
> /**
>  * @return a random normalized Cartesian vector.
>  */
> public double[] nextVector() {
>     final double[] v = new double[dimension];
>     // Pick a point by choosing a standard Gaussian for each element,
>     // and then normalize to unit length.
>     double normSq = 0;
>     for (int i = 0; i < dimension; i++) {
>         final double comp = sampler.sample();
>         v[i] = comp;
>         normSq += comp * comp;
>     }
>     
>     if (normSq == 0) {
>         // Extreme edge case
>         if (dimension != 0)
>             v[0] = 1;
>         return v;
>     }
>     final double f = 1 / Math.sqrt(normSq);
>     for (int i = 0; i < dimension; i++) {
>         v[i] *= f;
>     }
>     return v;
> }
> {code}
> This is all extremely unlikely however I noticed as I was reviewing new classes in V1.1
and saw this problem that I have hit before in a random walk simulation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Mime
View raw message