From: Patrick Hunt
Date: Fri, 20 Jul 2018 00:01:00 -0700
Subject: Re: Trying to find pattern in Flaky Tests
To: DevZooKeeper
Reply-To: dev@zookeeper.apache.org
In-Reply-To: <16bea9c9-e767-cf1f-3f7c-d5d530346a3f@nu.hu>

Something that's significantly different about the 3.4 and 3.5/master
Jenkins jobs is that 3.5/master has test.junit.threads=8 set, while this is
not supported in 3.4 (see build.xml). It's very likely that the
parallelization of the tests is causing the discrepancy. Setting threads > 1
significantly improves the speed of the jobs; that's why it was originally
added to 3.5+. See a358280fb2b3cc7852cded3fe67769765a519beb.

Perhaps we should try one or more of the 3.5/master jobs with threads=1 and
see?

Patrick

On Thu, Jul 19, 2018 at 1:26 PM Molnár Andor wrote:

> Sorry guys for this awful email. Looks like Apache converted my nicely
> illustrated email into plain text. :(
>
> Maybe I could attach the test reports as images, but I think you already
> got the idea.
>
> Andor
>
> On 07/18/2018 05:42 PM, Andor Molnar wrote:
> > Hi,
> >
> > *branch-3.4*
> >
> > I've taken a quick look at our Jenkins builds and, in terms of flaky
> > tests, it looks like branch-3.4 is in pretty good shape. The build hasn't
> > failed for 5-6 days on any JDK, which I think is pretty awesome.
> >
> > *branch-3.5*
> >
> > This branch is in very bad condition, which is quite unfortunate given
> > we're in the middle of stabilising it. :)
> > JDK8 is the worst: the last successful build was 11 days ago. JDK9 (50%
> > failing) and JDK10 (30% failing) are looking better in the last 10
> > builds.
> >
> > Interestingly (apart from a few quite rare ones), it looks like there's
> > only one test that is quite nasty on this branch:
> > testManyChildWatchersAutoReset
> >
> > There's a Jira about fixing it, and a fix that increases the timeout of
> > the test has been merged, but it's also possible that a bug on the branch
> > is causing the test to fail even with the 10 min timeout.
> >
> > I wasn't able to repro the failing test on my machine (Mac and CentOS7);
> > it always finished in 30-40 seconds maximum. On Jenkins slaves it shows
> > the following:
> >
> > *JDK 8:*
> >
> > Report creation timed out.
> >
> > *JDK 9:*
> >
> > testManyChildWatchersAutoReset
> >
> > Build   Duration (s)
> > 351     45.604
> > 350     600.337
> > 349     21.904
> > 348     583.063
> > 347     600.325
> > 346     600.383
> > 345     600.362
> > 344     21.139
> > 343     24.031
> > 342     584.200
> > 341     600.327
> > 340     600.323
> > 339     23.737
> > 338     600.406
> > 337     547.004
> > 336     600.393
> > 335     N/A
> > 334     373.955
> >
> > Test reports:
> > https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch35_java9/<build>/testReport/org.apache.zookeeper.test/DisconnectedWatcherTest/testManyChildWatchersAutoReset
> > (build 335: https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch35_java9/test_results_analyzer/)
> >
> > *JDK 10:*
> >
> > testManyChildWatchersAutoReset
> >
> > Build   Duration (s)
> > 110     364.945
> > 109     543.983
> > 108     388.182
> > 107     600.446
> > 106     600.025
> > 105     535.046
> > 104     600.306
> > 103     474.005
> > 102     560.925
> > 101     600.328
> > 100     558.547
> > 99      600.397
> > 98      600.414
> > 97      430.383
> > 96      564.064
> > 95      600.357
> > 94      432.435
> > 93      596.378
> > 92      39.242
> >
> > Test reports:
> > https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper_branch35_java10/<build>/testReport/org.apache.zookeeper.test/DisconnectedWatcherTest/testManyChildWatchersAutoReset
> >
> > It takes ages to complete on Jenkins for some reason, and it looks like
> > it quite frequently ends close to the limit, so I suspect it would
> > succeed eventually if we were to increase the timeout even more. But is
> > that correct?
> > Bug or infrastructure issue?
> >
> > *master / 3.6*
> >
> > Pretty much the same as 3.5. I haven't seen
> > testManyChildWatchersAutoReset failing on this branch with JDK8, which is
> > a bit confusing, but other than that I see the same pattern on JDK9 and
> > JDK10. I'm unable to generate the above reports here, because Test Result
> > Analyzer keeps timing out for me, but I'll follow up when I have them.
> >
> > Btw., the Flaky Test report has been broken for 10 days. I'm going to
> > raise a ticket on that if somebody is willing to fix it. (I'm planning to
> > do so.)
> > It would be nice to see the report working again, because if my
> > observations are correct, we don't have too many annoying tests apart
> > from the one mentioned.
> >
> > Thanks,
> > Andor
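
A rough way to check the "ends quite frequently close to the limit" observation is to bucket the per-build durations quoted in this thread. The sketch below is only an illustration: the durations are copied from the JDK 9 and JDK 10 reports (build 335, reported as N/A, is left out), the 600 s limit is the 10 min timeout mentioned in the thread, and the 540 s "near the limit" and 60 s "fast" cutoffs are assumptions chosen for this example, not values from the build.

```python
# Durations (seconds) of testManyChildWatchersAutoReset per Jenkins build,
# copied from the reports quoted in this thread. Build 335 (N/A) is omitted.
JDK9 = {
    351: 45.604, 350: 600.337, 349: 21.904, 348: 583.063, 347: 600.325,
    346: 600.383, 345: 600.362, 344: 21.139, 343: 24.031, 342: 584.200,
    341: 600.327, 340: 600.323, 339: 23.737, 338: 600.406, 337: 547.004,
    336: 600.393, 334: 373.955,
}
JDK10 = {
    110: 364.945, 109: 543.983, 108: 388.182, 107: 600.446, 106: 600.025,
    105: 535.046, 104: 600.306, 103: 474.005, 102: 560.925, 101: 600.328,
    100: 558.547, 99: 600.397, 98: 600.414, 97: 430.383, 96: 564.064,
    95: 600.357, 94: 432.435, 93: 596.378, 92: 39.242,
}

TIMEOUT_S = 600.0     # the 10 min test timeout mentioned in the thread
NEAR_LIMIT_S = 540.0  # assumption: "close to the limit" = within 10% of it
FAST_S = 60.0         # assumption: roughly what a local run looks like


def summarize(durations):
    """Count builds that hit the timeout, came close to it, or ran fast."""
    values = list(durations.values())
    return {
        "builds": len(values),
        "timed_out": sum(1 for v in values if v >= TIMEOUT_S),
        "near_limit": sum(1 for v in values if NEAR_LIMIT_S <= v < TIMEOUT_S),
        "fast": sum(1 for v in values if v < FAST_S),
    }


if __name__ == "__main__":
    for name, data in (("JDK 9", JDK9), ("JDK 10", JDK10)):
        print(name, summarize(data))
```

With these cutoffs, JDK 9 comes out as 8 of 17 builds at the timeout, 3 near it and 5 fast, while JDK 10 has 7 of 19 at the timeout, 5 near it and only 1 fast. The JDK 9 runs look bimodal (either about 20-45 s or at the limit), which would fit contention between parallel tests at least as well as a test that is simply slow, but the numbers alone cannot tell a bug from an infrastructure issue.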