commons-dev mailing list archives

From Thomas Neidhart <thomas.neidh...@gmail.com>
Subject Re: [MATH] Jenkins unit test failure
Date Tue, 13 Jan 2015 10:08:20 GMT
JaCoCo is not the cause: the tests still fail when it is disabled.

Thomas

On Tue, Jan 13, 2015 at 10:52 AM, Thomas Neidhart <thomas.neidhart@gmail.com> wrote:

> I ran several tests with additional console output.
> The build always ran on H11; sometimes it passed and sometimes it failed.
>
> Inspecting the bytecode did not show anything unusual; it looks correct.
> Maybe it is related to JaCoCo, as it adds a javaagent which manipulates
> bytecode during execution.
>
> I will do further tests to see if there is a connection.
>
> Thomas
>
> On Tue, Jan 13, 2015 at 2:50 AM, sebb <sebbaz@gmail.com> wrote:
>
>> On 13 January 2015 at 01:12, sebb <sebbaz@gmail.com> wrote:
>> > On 13 January 2015 at 00:53, Phil Steitz <phil.steitz@gmail.com> wrote:
>> >> On 1/12/15 5:44 PM, sebb wrote:
>> >>> On 12 January 2015 at 22:21, Thomas Neidhart <thomas.neidhart@gmail.com> wrote:
>> >>>> On 01/12/2015 11:17 PM, Phil Steitz wrote:
>> >>>>> On 1/12/15 2:30 PM, Thomas Neidhart wrote:
>> >>>>>> On 01/12/2015 10:26 PM, Thomas Neidhart wrote:
>> >>>>>>> On 01/12/2015 08:09 PM, Phil Steitz wrote:
>> >>>>>>>> On 1/12/15 11:37 AM, sebb wrote:
>> >>>>>>>>> On 12 January 2015 at 18:11, Phil Steitz <phil.steitz@gmail.com> wrote:
>> >>>>>>>>>> On 1/12/15 10:50 AM, sebb wrote:
>> >>>>>>>>>>> On 11 January 2015 at 22:10, Phil Steitz <phil.steitz@gmail.com> wrote:
>> >>>>>>>>>>>> On 1/11/15 11:19 AM, Phil Steitz wrote:
>> >>>>>>>>>>>>> On 1/10/15 10:49 PM, Phil Steitz wrote:
>> >>>>>>>>>>>>>> On 1/9/15 6:09 PM, sebb wrote:
>> >>>>>>>>>>>>>>> On 10 January 2015 at 01:01, Phil Steitz <phil.steitz@gmail.com> wrote:
>> >>>>>>>>>>>>>>>> On 1/9/15 5:32 PM, sebb wrote:
>> >>>>>>>>>>>>>>>>> On 9 January 2015 at 23:48, sebb <sebbaz@gmail.com> wrote:
>> >>>>>>>>>>>>>>>>>> Of the last 6 runs, only 1 had a problem with unit test failures.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> All the builds ran on ubuntu3, apart from the failure which ran on H10.
>> >>>>>>>>>>>>>>>>>> This may have some bearing on the result; I don't yet know.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> I had a quick look at 2 tests that failed:
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> SimpleRegressionTest.testPerfect
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> SimpleRegressionTest.testPerfectNegative
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> Although the test case has some instance data, these particular tests
>> >>>>>>>>>>>>>>>>>> do not use any, so it does not look like a concurrency issue in the
>> >>>>>>>>>>>>>>>>>> unit test itself.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> The SimpleRegression class has mutable instance data, but the test
>> >>>>>>>>>>>>>>>>>> cases create their own instance.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> I don't know anything about the math functions involved, but it looks
>> >>>>>>>>>>>>>>>>>> as though Infinity might result from getSignificance() if
>> >>>>>>>>>>>>>>>>>> getSlopeStdErr() returns 0, as the latter is used as a divisor. Or if
>> >>>>>>>>>>>>>>>>>> the field sumXX is 0 because that is also used as a divisor.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> Maybe the H10 host has different floating point hardware?
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> I'll try running some more tests on H10.
>> >>>>>>>>>>>>>>>>> The build failed again on H10; exactly the same tests failed as before:
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> This test:
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> https://builds.apache.org/job/Commons%20Math%20H10/1/console
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> Previous failure:
>> >>>>>>>>>>>>>>>>> https://builds.apache.org/job/Commons%20Math/14/console
>> >>>>>>>>>>>>>>>> This is actually a bug.  Thanks, sebb (and Jenkins)!
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> Has been here since 1.x.  What is going on is that the data sets
>> >>>>>>>>>>>>>>>> used in the test cases are set up to be perfect linear
>> >>>>>>>>>>>>>>>> relationships, which should in fact lead to mean square error (and
>> >>>>>>>>>>>>>>>> hence slope standard error) equal to 0.  The Jenkins box must be
>> >>>>>>>>>>>>>>>> getting exact 0.  The funny thing is the test is there to validate
>> >>>>>>>>>>>>>>>> correct performance for models like this.  Its success unfortunately
>> >>>>>>>>>>>>>>>> depends on poor precision.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> I will open a JIRA for this.  I don't think it is a release blocker
>> >>>>>>>>>>>>>>>> for 3.4.1, as I am sure you would get the same thing in any earlier
>> >>>>>>>>>>>>>>>> version of [math].
>> >>>>>>>>>>>>>>> OK good to know.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> I'll leave the H10 Jenkins job for now to make it easy to retest.
>> >>>>>>>>>>>>>> My first guess here was wrong.  The infinities are being handled
>> >>>>>>>>>>>>>> correctly for the JDKs I have.  Something must be going awry in the
>> >>>>>>>>>>>>>> t distribution cumulative probability computation for +INF on the
>> >>>>>>>>>>>>>> box that is failing.  Is there a way to find out exactly what JDK
>> >>>>>>>>>>>>>> and OS version are being used?
>> >>>>>>>>>>>>> I just committed a test that tests the t distribution computations
>> >>>>>>>>>>>>> directly.  It seems to have run clean; but the other test ran clean
>> >>>>>>>>>>>>> too.  Is there any way to force the build to use the host that fails?
>> >>>>>>>>>>>> I can't make any sense of what is going on with the Jenkins builds.
>> >>>>>>>>>>>> Clean runs and then lots of errors.  This one explains the
>> >>>>>>>>>>>> SimpleRegression "problem" (which is not a problem with that class
>> >>>>>>>>>>>> at least)
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> testCumulativeProbablilityExtremes(org.apache.commons.math3.distribution.TDistributionTest) Time elapsed: 0.001 sec  <<< FAILURE!
>> >>>>>>>>>>>> java.lang.AssertionError: expected:<1.0> but was:<-Infinity>
>> >>>>>>>>>>>>         at org.junit.Assert.fail(Assert.java:88)
>> >>>>>>>>>>>>         at org.junit.Assert.failNotEquals(Assert.java:743)
>> >>>>>>>>>>>>         at org.junit.Assert.assertEquals(Assert.java:494)
>> >>>>>>>>>>>>         at org.junit.Assert.assertEquals(Assert.java:592)
>> >>>>>>>>>>>>         at org.apache.commons.math3.distribution.TDistributionTest.testCumulativeProbablilityExtremes(TDistributionTest.java:109)
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Earlier runs this ran clean. There is nothing non-deterministic about
>> >>>>>>>>>>>> this test (or quite a few of the others that randomly seem to fail).
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> I wonder if we have a bad cpu or something somewhere.
>> >>>>>>>>>>> AFAICS all the failed builds ran on H10.
>> >>>>>>>>>>>
>> >>>>>>>>>>> IMO it is consistent; the apparent randomness comes from the fact that
>> >>>>>>>>>>> there are several Ubuntu hosts, including H10.
>> >>>>>>>>>> Am I reading it / looking at the wrong one, or did this one succeed?
>> >>>>>>>>>>
>> >>>>>>>>>> https://builds.apache.org/view/All/job/Commons%20Math%20H10/6/
>> >>>>>>>>>>
>> >>>>>>>>>> That one was right after I added tests confirming that the t
>> >>>>>>>>>> distribution cum prob handles INFs correctly.
>> >>>>>>>>> That did run on H10 and did succeed; I'd not noticed that one before.
>> >>>>>>>>>
>> >>>>>>>>> I think it is still true that the failures have only occurred on H10.
>> >>>>>>>>>
>> >>>>>>>>> However, the latest one is failing:
>> >>>>>>>>>
>> >>>>>>>>> https://builds.apache.org/job/Commons%20Math/24/console
>> >>>>>>>>>
>> >>>>>>>>> This is on H11 - I think that's the first time H11 has been used.
>> >>>>>>>>>
>> >>>>>>>>> I suppose it's possible that H10 and H11 have a common failing, but it
>> >>>>>>>>> seems less likely.
>> >>>>>>>>>
>> >>>>>>>>> I added a bit more debug - showing the value of sumXX - but that seems
>> >>>>>>>>> OK on H11.
>> >>>>>>>>>
>> >>>>>>>>> I just added a bit more debug.
>> >>>>>>>> I am pretty sure the SimpleRegressionTest failure is actually caused
>> >>>>>>>> by the same thing causing the t-distribution test to fail (the
>> >>>>>>>> reason I added that one).
>> >>>>>>>>
>> >>>>>>>> One that is more straightforward to chase is this one, which fails
>> >>>>>>>> pretty consistently when "bad things happen"
>> >>>>>>>>
>> >>>>>>>> testExpInf(org.apache.commons.math3.complex.ComplexTest)  Time elapsed: 0.001 sec  <<< FAILURE!
>> >>>>>>>> java.lang.AssertionError: expected:<0.0> but was:<Infinity>
>> >>>>>>>>    at org.junit.Assert.fail(Assert.java:88)
>> >>>>>>>>    at org.junit.Assert.failNotEquals(Assert.java:743)
>> >>>>>>>>    at org.junit.Assert.assertEquals(Assert.java:494)
>> >>>>>>>>    at org.junit.Assert.assertEquals(Assert.java:592)
>> >>>>>>>>    at org.apache.commons.math3.TestUtils.assertSame(TestUtils.java:76)
>> >>>>>>>>    at org.apache.commons.math3.TestUtils.assertSame(TestUtils.java:84)
>> >>>>>>>>    at org.apache.commons.math3.complex.ComplexTest.testExpInf(ComplexTest.java:788)
>> >>>>>>>>
>> >>>>>>>> I would wager that what is going on here is 0.0 * -INF = INF.
>> >>>>>>> The output returned by the debug statements added by sebb is:
>> >>>>>>>
>> >>>>>>> expReal=Infinity
>> >>>>>>> cosImag=0.5403023058681398
>> >>>>>>> sinImag=0.8414709848078965
>> >>>>>>> result=(Infinity, Infinity)
>> >>>>>>>
>> >>>>>>> while expReal should be -Infinity.
>> >>>>>>>
>> >>>>>>> of course, Math.exp(Infinity) = Infinity.
>> >>>>>> oh stupid mistake, please forget my last post.
>> >>>>>> I messed up expReal with the actual real value.
>> >>>>> But it should be 0, since expReal should be exp(-INF)
>> >>>> I just added some more debug output to the test, and the result is:
>> >>>>
>> >>>> real=-Infinity
>> >>>> -real=2147483647
>> >>>> expReal=Infinity
>> >>>>
>> >>>> according to FastMath.exp(), with these values, the code path should be
>> >>>> as follows:
>> >>>>
>> >>>>         if (x < 0.0) {
>> >>>>             intVal = (int) -x;
>> >>>>
>> >>>>             if (intVal > 746) {
>> >>>>                 if (hiPrec != null) {
>> >>>>                     hiPrec[0] = 0.0;
>> >>>>                     hiPrec[1] = 0.0;
>> >>>>                 }
>> >>>> -->             return 0.0;
>> >>>>             }
>> >>>>
>> >>>>
>> >>>> but obviously it doesn't do this. I guess we can only inspect the
>> >>>> generated class files for a potential compiler bug.
>> >>> That suggests there should be some additional FastMath tests to show
>> >>> the underlying error.
>> >
>> > Actually there is such a test and it also fails.
>> >
>> >>> Perhaps compare with the basic Math versions where relevant.
>> >>
>> >> What Thomas is pointing out is that the code is not executing
>> >> correctly, unless we are missing something.  This has nothing to do
>> >> with FastMath vs. Math.
>> >
>> > I was just suggesting that it would be worth checking Math to make
>> > sure it does not behave the same way.
>> >
>> >>  Did we ever find out what the JDK is?
>> >
>> > No, but could probably add debug to show the java system variables.
>>
>> I've added a new test with that, plus a test of Math.exp()
>>
>> >> Given the sporadic nature of the failures (different tests failing
>> >> different times),
>> >
>> > Are we sure different tests are failing?
>> > I've not checked that.
>> >
>> > The same hosts fail each time.
>>
>> Oops, no they don't always fail.
>>
>> >> I wonder if there is some kind of storage /
>> >> filesystem or cpu corruption going on.  Do the Jenkins slaves share
>> >> a common file system or disk array?  Are they virtual hosts?
>> >
>> > No idea.
>> > I just think it's suspicious that the failures seem to be repeatable.
>> > If a host fails once, it always fails.
>>
>> Oops - not so, they sometimes work...
>>
>> > Hardware failures tend to be more random, and given the test
>> > environment it's unlikely that the same memory will be used each time.
>>
>> >> Phil
>> >>>
>> >>>> Thomas
>> >>>>
>> >>>>> Phil
>> >>>>>> Thomas
>> >>>>>>
>> >>>>>>> Thomas
>> >>>>>>>
>> >>>>>>>>> I can perhaps change the H10 job to additionally run on H11.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>> Phil
>> >>>>>>>>>>>> Phil
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>> Phil
>> >>>>>>>>>>>>>> Phil
>> >>>>>>>>>>>>>>>> Phil
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> ---------------------------------------------------------------------
>> >>>>>>>>>>>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
>> >>>>>>>>>>>>>>>>> For additional commands, e-mail: dev-help@commons.apache.org
>
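
The code path quoted from FastMath.exp() in the thread can be reproduced in isolation. The following is a minimal sketch, not the Commons Math source: the class name ExpNegInfSketch and the method expEarlyExit are hypothetical, and only the early-exit branch quoted in the thread is modelled (everything else falls back to java.lang.Math.exp). It shows that, under the Java language rules, casting +Infinity to int saturates at Integer.MAX_VALUE (the "-real=2147483647" debug line), so a correctly executing JVM should take the "return 0.0" branch for an argument of -Infinity.

```java
class ExpNegInfSketch {

    // Hypothetical stand-in for the branch quoted in the thread; this is
    // NOT the real FastMath.exp(), only its early exit for large negative x.
    static double expEarlyExit(double x) {
        if (x < 0.0) {
            // Narrowing conversion: (int) of +Infinity saturates to
            // Integer.MAX_VALUE rather than overflowing.
            int intVal = (int) -x;
            if (intVal > 746) {
                return 0.0;
            }
        }
        // Assumption: defer to the JDK for all other arguments.
        return Math.exp(x);
    }

    public static void main(String[] args) {
        double real = Double.NEGATIVE_INFINITY;
        System.out.println("real=" + real);
        System.out.println("-real as int=" + (int) -real);
        System.out.println("expEarlyExit(real)=" + expEarlyExit(real));
        System.out.println("Math.exp(real)=" + Math.exp(real));
    }
}
```

If a host prints anything other than 0.0 for the last two lines, the problem lies below the library code (JVM, JIT, or hardware), which is consistent with the direction the thread takes.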

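The thread asks which JDK and OS the failing hosts run, and sebb mentions adding debug output for the Java system properties plus a check of Math.exp(). A minimal sketch of such a diagnostic might look like the following. The class name JvmInfoSketch and the choice of property keys are assumptions for illustration; the keys themselves are standard Java system properties. This is not the test that was actually committed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class JvmInfoSketch {

    // Standard system property keys; the selection is an assumption about
    // what is useful when comparing build hosts.
    static final String[] KEYS = {
        "java.version", "java.vendor", "java.vm.name", "java.vm.version",
        "os.name", "os.arch", "os.version"
    };

    // Collects the properties in a stable order so that the output of
    // different hosts can be compared line by line.
    static Map<String, String> collect() {
        Map<String, String> info = new LinkedHashMap<String, String>();
        for (String key : KEYS) {
            info.put(key, System.getProperty(key));
        }
        return info;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : collect().entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
        // Reference values from the JDK itself, for comparison with FastMath.
        System.out.println("Math.exp(-Infinity)=" + Math.exp(Double.NEGATIVE_INFINITY));
        System.out.println("(int) +Infinity=" + (int) Double.POSITIVE_INFINITY);
    }
}
```

Running this at the start of a Jenkins build would record the environment next to any failures, so that passing and failing runs on the same host can be compared.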