zookeeper-dev mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: Trying to find pattern in Flaky Tests
Date Wed, 01 Aug 2018 19:11:44 GMT
Looks like 16808 has been resolved - I haven't noticed it after the recent
changes.

Note that INFRA recently added openjdk10 to Jenkins and I added a job or
two which seem to be working OK.

Java 11 is failing on 3.4 due to broken libraries (according to Rakesh on
another thread) but we're also seeing failures on trunk which are unrelated
to that issue. Perhaps someone can take a look?

Patrick

On Tue, Jul 24, 2018 at 3:49 PM Patrick Hunt <phunt@apache.org> wrote:

> FYI, there's also this which I just reported:
> https://issues.apache.org/jira/browse/INFRA-16808
>
> Patrick
>
> On Fri, Jul 20, 2018 at 12:01 AM Patrick Hunt <phunt@apache.org> wrote:
>
>> Something that's significantly different about the 3.4 and 3.5/master
>> Jenkins jobs is that 3.5/master has
>>
>> test.junit.threads=8
>>
>> set while this is not supported in 3.4 (see build.xml). It's very likely
>> that the parallelization of the tests is causing the discrepancy.
>>
>> Setting threads > 1 significantly improves the speed of the jobs, which is
>> why it was originally added to 3.5+.
>> See commit a358280fb2b3cc7852cded3fe67769765a519beb
>>
>> Perhaps we should try one or more of the 3.5/master jobs with threads=1
>> and see whether the failures go away?
>>
>> Patrick
>>
>>
>>
>> On Thu, Jul 19, 2018 at 1:26 PM Molnár Andor <andor@nu.hu> wrote:
>>
>>> Sorry guys for this awful email. Looks like Apache converted my nicely
>>> illustrated email into plain text. :(
>>>
>>> Maybe I could attach the test reports as images, but I think you already
>>> got the idea.
>>>
>>>
>>> Andor
>>>
>>>
>>>
>>> On 07/18/2018 05:42 PM, Andor Molnar wrote:
>>> > Hi,
>>> >
>>> > *branch-3.4*
>>> >
>>> > I've taken a quick look at our Jenkins builds and, in terms of flaky
>>> > tests, branch-3.4 looks to be in pretty good shape. The build hasn't
>>> > failed for 5-6 days on any JDK, which I think is pretty awesome.
>>> >
>>> > *branch-3.5*
>>> >
>>> > This branch is in very bad condition, which is quite unfortunate given
>>> > we're in the middle of stabilising it. :)
>>> > On JDK8 especially, the last successful build was 11 days ago. JDK9
>>> > (50% failing) and JDK10 (30% failing) look better over the last 10
>>> > builds.
>>> >
>>> > Interestingly (apart from a few quite rare ones), it looks like there's
>>> > only one test that is really nasty on this branch:
>>> > testManyChildWatchersAutoReset
>>> >
>>> > There's a Jira about fixing it, and a fix that increases the test's
>>> > timeout has been merged, but it's also possible that a bug on the
>>> > branch causes the test to fail even with a 10 min timeout.
>>> >
>>> > I wasn't able to repro the failing test on my machines (Mac and
>>> > CentOS7); it always finished in 30-40 seconds at most. On the Jenkins
>>> > slaves it shows the following:
>>> >
>>> > *JDK 8:*
>>> >
>>> > Report creation timed out.
>>> >
>>> >
>>> > *JDK 9:*
>>> >
>>> > testManyChildWatchersAutoReset, duration per build:
>>> >
>>> > Build   Duration (s)
>>> > 351     45.604
>>> > 350     600.337
>>> > 349     21.904
>>> > 348     583.063
>>> > 347     600.325
>>> > 346     600.383
>>> > 345     600.362
>>> > 344     21.139
>>> > 343     24.031
>>> > 342     584.200
>>> > 341     600.327
>>> > 340     600.323
>>> > 339     23.737
>>> > 338     600.406
>>> > 337     547.004
>>> > 336     600.393
>>> > 335     N/A
>>> > 334     373.955
>>> >
>>> > Each value links to the test report for its build number:
>>> > https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch35_java9/<build>/testReport/org.apache.zookeeper.test/DisconnectedWatcherTest/testManyChildWatchersAutoReset
>>> > The N/A entry for build 335 links to:
>>> > https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch35_java9/test_results_analyzer/
>>> >
>>> >
>>> > *JDK 10:*
>>> >
>>> >
>>> > testManyChildWatchersAutoReset, duration per build:
>>> >
>>> > Build   Duration (s)
>>> > 110     364.945
>>> > 109     543.983
>>> > 108     388.182
>>> > 107     600.446
>>> > 106     600.025
>>> > 105     535.046
>>> > 104     600.306
>>> > 103     474.005
>>> > 102     560.925
>>> > 101     600.328
>>> > 100     558.547
>>> > 99      600.397
>>> > 98      600.414
>>> > 97      430.383
>>> > 96      564.064
>>> > 95      600.357
>>> > 94      432.435
>>> > 93      596.378
>>> > 92      39.242
>>> >
>>> > Each value links to the test report for its build number:
>>> > https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch35_java10/<build>/testReport/org.apache.zookeeper.test/DisconnectedWatcherTest/testManyChildWatchersAutoReset
>>> >
>>> >
>>> > It takes ages to complete on Jenkins for some reason, and it quite
>>> > frequently ends close to the limit, so I suspect it would eventually
>>> > succeed if we increased the timeout even more. But is that correct?
>>> > Is this a bug or an infrastructure issue?
>>> >
>>> > *master / 3.6*
>>> >
>>> > Pretty much the same as 3.5. I haven't seen
>>> > testManyChildWatchersAutoReset failing on this branch with JDK8, which
>>> > is a bit confusing, but other than that I see the same pattern on JDK9
>>> > and JDK10. I'm unable to generate the above reports here because Test
>>> > Result Analyzer keeps timing out for me, but I'll follow up when I
>>> > have them.
>>> >
>>> > Btw, the Flaky Test report has been broken for 10 days. I'm going to
>>> > raise a ticket on that in case somebody is willing to fix it. (I'm
>>> > planning to do so.)
>>> > It would be nice to see the report working again, because if my
>>> > observations are correct, we don't have too many annoying tests apart
>>> > from the one mentioned.
>>> >
>>> > Thanks,
>>> > Andor
>>> >
>>>
>>>
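
Editor's note: Andor's observation that the test "ends quite frequently close to the limit" can be checked against the numbers he quotes. The sketch below tallies the JDK 9 durations from the report above. It assumes the values are seconds and that the 10 min timeout corresponds to 600 s; the 90% "near limit" threshold and the names classify and summarize are illustrative choices, not anything from the thread or from ZooKeeper's build.

```python
# Durations (assumed to be seconds) of testManyChildWatchersAutoReset on
# ZooKeeper_branch35_java9, copied from the quoted report. Build 335 is
# omitted because the report shows N/A for it.
durations = {
    351: 45.604, 350: 600.337, 349: 21.904, 348: 583.063, 347: 600.325,
    346: 600.383, 345: 600.362, 344: 21.139, 343: 24.031, 342: 584.200,
    341: 600.327, 340: 600.323, 339: 23.737, 338: 600.406, 337: 547.004,
    336: 600.393, 334: 373.955,
}

TIMEOUT_S = 600.0               # the 10 min test timeout mentioned in the thread
NEAR_LIMIT_S = 0.9 * TIMEOUT_S  # assumption: "close to the limit" = last 10%


def classify(seconds):
    """Bucket a single run by how close it came to the timeout."""
    if seconds >= TIMEOUT_S:
        return "timed out"
    if seconds >= NEAR_LIMIT_S:
        return "near limit"
    return "finished"


def summarize(runs):
    """Count how many runs fall into each bucket."""
    counts = {"timed out": 0, "near limit": 0, "finished": 0}
    for seconds in runs.values():
        counts[classify(seconds)] += 1
    return counts


if __name__ == "__main__":
    for bucket, count in summarize(durations).items():
        print(f"{bucket}: {count} of {len(durations)} runs")
```

With these assumptions, most of the 17 recorded runs either hit the timeout or finish within the last minute before it, which fits the suspicion in the thread, though it does not by itself distinguish a bug from an infrastructure problem.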
