On 05/15/2015 08:10 PM, Luc Maisonobe wrote:
> Le 14/05/2015 01:29, Gilles a écrit :
>> On Wed, 13 May 2015 19:43:09 +0200, Luc Maisonobe wrote:
>>> Le 13/05/2015 11:35, Gilles a écrit :
>>>> Hello.
>>>>
>>>> On Wed, 13 May 2015 11:12:16 +0200, Luc Maisonobe wrote:
>>>>> Hi all,
>>>>>
>>>>> As Jenkins failures appeared again, I restarted work on FastMath.pow.
>>>>> I'll keep you informed as I make progress.
>>>>
>>>> Did someone see those failures anywhere else other than on the
>>>> identified machine used for CI?
>>>
>>> I did not reproduce them myself, and I guess most of us didn't succeed
>>> either (let's remember last year, when Thomas and Sebb tried to fix the
>>> exp function).
>>>
>>> However, the configuration is not a rare one (Ubuntu, Java 7, Sun JVM),
>>> so I fear this bug, even if rare, can hit several people.
>>
>> How much damage can it do, apart from a false positive on a unit test?
>> Isn't the configuration faulty, rather than the CM code?
>
> There are at least some (small) problems in the code.
> I reimplemented the method, relying more heavily on the bit
> representation to detect special cases. This new implementation also
> delegates all integer powers to pow(double, long), which handles these
> cases directly with high-accuracy multiplication and reciprocal.
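For reference, the core of an integer-power routine is binary exponentiation (square-and-multiply). Here is a minimal, simplified sketch of that idea; the actual FastMath.pow(double, long) additionally carries the running product as a high-accuracy double-double pair, which this sketch omits:

```java
public class IntPow {
    // Binary exponentiation: O(log e) multiplications.
    // Simplified sketch only; the real implementation uses error-free
    // (double-double) products to keep the error well below 1 ulp.
    static double pow(double d, long e) {
        if (e < 0) {
            return 1.0 / pow(d, -e); // negative exponent: reciprocal
        }
        double result = 1.0;
        double square = d;
        while (e > 0) {
            if ((e & 1) == 1) {
                result *= square; // multiply in for each set exponent bit
            }
            square *= square;
            e >>= 1;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(pow(2.0, 10)); // 1024.0
        System.out.println(pow(2.0, -2)); // 0.25
    }
}
```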
>
> This showed that our previous implementation was sometimes slightly
> off. For example, during one test in the gamma distribution suite, we
> compute 147.2421875^142. Our former implementation returned
> 1.3785386632060381807...e308, the new implementation returns
> 1.3785386632060391688...e308, and the exact result should be
> 1.3785386632060390905...e308. These are normal doubles near the top of
> the representable range; their 52-bit stored mantissas are:
>
> former implementation 0x9E9AA8184555B
> new implementation 0x9E9AA8184555D
> exact result 0x9E9AA8184555C.D7655353C...
>
> So we had more than 1.8 ulp error in this case and now we have 0.16 ulp
> error.
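Ulp distances like these can be measured directly from the bit patterns, since for positive doubles the raw bit representations are ordered like integers. A small sketch (FastMath isn't assumed on the classpath here, so it compares the JDK's Math.pow against the fdlibm-based StrictMath.pow on the same case; the helper works just as well on FastMath values):

```java
public class UlpDiff {
    // Number of representable doubles between two finite, same-sign,
    // positive values: the raw bit patterns are monotonically ordered.
    static long ulpDistance(double a, double b) {
        long ba = Double.doubleToLongBits(a);
        long bb = Double.doubleToLongBits(b);
        return Math.abs(ba - bb);
    }

    public static void main(String[] args) {
        double x = 147.2421875;
        double fast = Math.pow(x, 142);         // JDK intrinsic pow
        double strict = StrictMath.pow(x, 142); // fdlibm reference
        System.out.println(ulpDistance(fast, strict));
        System.out.println(ulpDistance(1.0, Math.nextUp(1.0))); // 1
    }
}
```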
>
> Comparing all the calls made in our test suite for the full library,
> the differences between the implementations always occur in the integer
> power cases (which is expected if you look at the code); they are quite
> rare and generally only 1 ulp apart. The case above with a 2 ulp
> difference (from -1.8 to +0.2) is rare.
>
> Performance is equivalent: the new code is neither faster nor slower.
> On my 7-year-old desktop computer running Linux, FastMath.pow takes
> about 0.69 as much time as StrictMath.pow, as per Gilles' performance
> benchmarks.
>
> Two JUnit tests did not pass anymore, and it took me a long time to
> understand why. The errors were really due to those 1 or 2 ulp differences.
>
> The first test was for the BOBYQA optimizer, with the DiffPow
> function, which computes:
>
> f(x) = abs(x0)^2 + abs(x1)^4 + abs(x2)^6 +
> abs(x3)^8 + abs(x4)^10 + abs(x5)^12
>
> The last term (power 12) is sometimes 1 ulp away from the previous
> implementation (and, I think, closer to the exact result). The
> optimizer does converge to the expected result (which is obviously all
> xi = 0), but the number of iterations is extremely large and highly
> dependent on the accuracy. With the former implementation, it took
> about 8000 iterations on my computer, for a limit set to 12000 in this
> test. With the new implementation, I needed to raise the limit to 21000
> to avoid failing with a max-iterations error. This test really seems to
> be a stress test for the optimizer: the values computed are very small,
> down in the denormalized range.
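For readers following along, the DiffPow objective fits in a few lines; this is a hypothetical standalone version (the actual test class lives in the BOBYQA test suite), showing that every term goes through an even integer power, i.e. exactly the pow path whose accuracy changed:

```java
public class DiffPow {
    // f(x) = sum over i of abs(x[i])^(2*(i+1)),
    // i.e. powers 2, 4, 6, 8, 10, 12 for a 6-dimensional point.
    static double value(double[] x) {
        double f = 0.0;
        for (int i = 0; i < x.length; i++) {
            f += Math.pow(Math.abs(x[i]), 2 * (i + 1));
        }
        return f;
    }

    public static void main(String[] args) {
        System.out.println(value(new double[6]));                   // 0.0 at the optimum
        System.out.println(value(new double[] {1, 1, 1, 1, 1, 1})); // 6.0
    }
}
```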
>
> The second test was for the gamma distribution, with a shape parameter
> of 142. Only one evaluation of pow differed: the one depicted above.
> The two-ulp difference meant the error in the non-overflow part of the
> distribution (with x from 0 to 150) was between 1 and 6 ulps, with a
> mean of about 3.54 ulps, whereas with the previous implementation it
> was always 0 or 1 ulp, with a mean slightly below 0.5 ulp. So I had to
> raise the test tolerance to make it pass. This seems strange to me. I
> tried to recompute the reference files using the Maxima script,
> increasing the accuracy from 64 digits to 96 digits, but got similar
> behaviour. The tests for the other shape parameters (values 1, 8, 10,
> 100 and 1000) all pass; only value 142 fails. I don't even know why
> this 142 value was used (if it were 42, I would have taken it as a
> well-known humorous reference). Perhaps it was already a borderline
> case.
>
> So here we are. A new implementation is available in the h10builds
> branch. It is as fast as the previous one and probably slightly more
> accurate for integer powers, but it makes two tests fail unless we
> adjust some test parameters or tolerances. The H10 host seems not to
> trigger the JIT optimization bug anymore with this implementation.
>
> What do we do? Do we put this back to the master branch?
+1 to merge the changes back in.
Thomas

To unsubscribe, email: dev-unsubscribe@commons.apache.org
For additional commands, email: dev-help@commons.apache.org
