zookeeper-dev mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: Trying to find pattern in Flaky Tests
Date Wed, 18 Jul 2018 18:16:23 GMT
Ok, I committed a change that seems to address the main failure:
https://github.com/apache/zookeeper/commit/06b9507ab78a1a055b8f467846c15791600b72ee

https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper-Find-Flaky-Tests/lastSuccessfulBuild/artifact/report.html

However, I do notice some oddness: for some jobs/runs it fails to get the
information from the REST interface, even though it's fine for most of
them. Take a look; any ideas?
https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper-Find-Flaky-Tests/456/console

[ZooKeeper-Find-Flaky-Tests] $ /bin/bash /tmp/jenkins4452773653790031730.sh
ERROR:__main__:failed to get:
https://builds.apache.org/job/ZooKeeper-trunk/108/testReport/api/json?tree=suites%5Bname%2Ccases%5BclassName%2Cname%2Cstatus%5D%5D
ERROR:__main__:failed to get:
https://builds.apache.org/job/ZooKeeper-trunk/104/testReport/api/json?tree=suites%5Bname%2Ccases%5BclassName%2Cname%2Cstatus%5D%5D
ERROR:__main__:failed to get:
https://builds.apache.org/job/ZooKeeper-trunk/100/testReport/api/json?tree=suites%5Bname%2Ccases%5BclassName%2Cname%2Cstatus%5D%5D


Notice that it doesn't complain about job 107 (etc...)

Any ideas on this? Have you seen this before? Perhaps we should open an
INFRA jira?

Patrick

On Wed, Jul 18, 2018 at 10:52 AM Patrick Hunt <phunt@apache.org> wrote:

> FYI, created this:
> https://issues.apache.org/jira/browse/INFRA-16785
> for the security warnings; not sure if that's causing the issue. Likely
> it's the recent Jenkins upgrade. Looking into it a bit...
>
> Patrick
>
>
> On Wed, Jul 18, 2018 at 9:48 AM Michael Han <hanm@apache.org> wrote:
>
>> Hi Andor,
>>
>> >> I suspect it should succeed eventually if we were to increase the
>> timeout even more. But is that correct? Bug or infrastructure issue?
>>
>> You could set up a dedicated git branch with all the patches (e.g. the one
>> in ZOOKEEPER-2251) you want to apply, and I can set up a dedicated Jenkins
>> job that points to this branch and stress tests the entire unit test suite.
>> Some tests are only flaky when they run on Apache infrastructure and when
>> they run together.
>>
>> It would be interesting to figure out what causes this test to fail. Since
>> the same test works reliably in 3.4, there must be some commits in 3.5 that
>> we could possibly blame...
>>
>> >> I'm going to raise a ticket on that if somebody is willing to fix it.
>>
>> I just had a brief look before Jenkins went down. It looks like Python was
>> complaining about some SSL stuff, and I suspect that if we upgrade to a
>> later version of Python (3.x) it might work. I'll try that later when
>> Jenkins is back.
>>
>>
>> On Wed, Jul 18, 2018 at 8:42 AM, Andor Molnar <andor@cloudera.com.invalid> wrote:
>>
>> > Hi,
>> >
>> > *branch-3.4*
>> >
>> > I've taken a quick look at our Jenkins builds and, in terms of flaky
>> > tests, it looks like branch-3.4 is in pretty good shape. The build hasn't
>> > failed for 5-6 days on all JDKs, which I think is pretty awesome.
>> >
>> > *branch-3.5*
>> >
>> > This branch is in very bad condition, which is quite unfortunate given
>> > we're in the middle of stabilising it. :)
>> > Especially on JDK8: the last successful build was 11 days ago. JDK9 (50%
>> > failing) and JDK10 (30% failing) are looking better in the last 10 builds.
>> >
>> > Interestingly (apart from a few quite rare ones) it looks like there's
>> > only one test which is quite nasty on this branch:
>> > testManyChildWatchersAutoReset
>> >
>> > There's a Jira about fixing it, and a fix that increases the timeout of
>> > the test has been merged, but it's also possible that a bug on the branch
>> > is causing the test to fail even with the 10 min timeout.
>> >
>> > I wasn't able to repro the failing test on my machines (Mac and CentOS7);
>> > it always finished in 30-40 seconds maximum. On Jenkins slaves it shows
>> > the following:
>> >
>> > *JDK 8:*
>> >
>> > Report creation timed out.
>> >
>> >
>> > *JDK 9:*
>> >
>> > testManyChildWatchersAutoReset (Test Result Analyzer)
>> >
>> > Build Number | Time (s)
>> > -------------+---------
>> > 351          | 45.604
>> > 350          | 600.337
>> > 349          | 21.904
>> > 348          | 583.063
>> > 347          | 600.325
>> > 346          | 600.383
>> > 345          | 600.362
>> > 344          | 21.139
>> > 343          | 24.031
>> > 342          | 584.200
>> > 341          | 600.327
>> > 340          | 600.323
>> > 339          | 23.737
>> > 338          | 600.406
>> > 337          | 547.004
>> > 336          | 600.393
>> > 335          | N/A
>> > 334          | 373.955
>> >
>> > Each value links to the build's test report, where <N> is the build number:
>> > <https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch35_java9/<N>/testReport/org.apache.zookeeper.test/DisconnectedWatcherTest/testManyChildWatchersAutoReset>
>> >
>> >
>> > *JDK 10:*
>> >
>> >
>> > testManyChildWatchersAutoReset (Test Result Analyzer)
>> >
>> > Build Number | Time (s)
>> > -------------+---------
>> > 110          | 364.945
>> > 109          | 543.983
>> > 108          | 388.182
>> > 107          | 600.446
>> > 106          | 600.025
>> > 105          | 535.046
>> > 104          | 600.306
>> > 103          | 474.005
>> > 102          | 560.925
>> > 101          | 600.328
>> > 100          | 558.547
>> > 99           | 600.397
>> > 98           | 600.414
>> > 97           | 430.383
>> > 96           | 564.064
>> > 95           | 600.357
>> > 94           | 432.435
>> > 93           | 596.378
>> > 92           | 39.242
>> >
>> > Each value links to the build's test report, where <N> is the build number:
>> > <https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch35_java10/<N>/testReport/org.apache.zookeeper.test/DisconnectedWatcherTest/testManyChildWatchersAutoReset>
>> >
>> >
>> > It takes ages to complete on Jenkins for some reason and it looks like it
>> > ends quite frequently close to the limit, so I suspect it should succeed
>> > eventually if we were to increase the timeout even more. But is that
>> > correct? Bug or infrastructure issue?
>> >
>> > *master / 3.6*
>> >
>> > Pretty much the same as 3.5. I haven't seen testManyChildWatchersAutoReset
>> > failing on this branch with JDK8, which is a bit confusing, but other than
>> > that I see the same pattern on JDK9 and JDK10. I'm unable to generate the
>> > above reports here, because Test Result Analyzer keeps timing out for me,
>> > but I'll follow up when I have them.
>> >
>> > Btw, the Flaky Test report has been broken for 10 days. I'm going to
>> > raise a ticket on that if somebody is willing to fix it. (I'm planning to
>> > do so.)
>> > It would be nice to see the report working again, because if my
>> > observations are correct, we don't have too many annoying tests apart
>> > from the one mentioned.
>> >
>> > Thanks,
>> > Andor
>> >
>>
>
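
As a rough way to quantify "ends quite frequently close to the limit", the
sketch below counts how many of the quoted testManyChildWatchersAutoReset
runs reached the 10 min timeout. It assumes the quoted numbers are durations
in seconds and that a run at or above 600 s hit the timeout; the names
`TIMEOUT_S` and `summarize` are made up for this example.

```python
TIMEOUT_S = 600.0  # assumed: the 10 min test timeout, in seconds

# Durations copied from the quoted report, newest build first.
# None marks the build reported as N/A.
JDK9 = [45.604, 600.337, 21.904, 583.063, 600.325, 600.383, 600.362,
        21.139, 24.031, 584.200, 600.327, 600.323, 23.737, 600.406,
        547.004, 600.393, None, 373.955]            # builds 351..334
JDK10 = [364.945, 543.983, 388.182, 600.446, 600.025, 535.046, 600.306,
         474.005, 560.925, 600.328, 558.547, 600.397, 600.414, 430.383,
         564.064, 600.357, 432.435, 596.378, 39.242]  # builds 110..92


def summarize(durations, timeout=TIMEOUT_S):
    """Return (runs at or over the timeout, runs with a known duration)."""
    known = [d for d in durations if d is not None]
    timed_out = sum(1 for d in known if d >= timeout)
    return timed_out, len(known)


for name, runs in (("JDK9", JDK9), ("JDK10", JDK10)):
    timed_out, total = summarize(runs)
    print(f"{name}: {timed_out} of {total} runs reached {TIMEOUT_S:.0f} s")
```

Under those assumptions this gives 8 of 17 runs for JDK9 and 7 of 19 for
JDK10, and most of the remaining runs still took several minutes, far longer
than the 30-40 seconds seen locally.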
