lucene-dev mailing list archives

From "Uwe Schindler (JIRA)" <>
Subject [jira] [Commented] (SOLR-12016) Reduce noise from flakey tests
Date Sun, 25 Feb 2018 11:23:00 GMT


Uwe Schindler commented on SOLR-12016:

Hi [~erickerickson]: I updated the patch. During my merge actions, I forgot to actually enable
the {{@BadApple}} tests - I had only updated the comments, and the main change in the {{@TestGroup}}
annotation was lost. Now it's fine.

I still have not tested the smoker change.
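
For readers unfamiliar with the mechanism, here is a minimal, self-contained sketch of how a test-group annotation toggled by a system property could work. It is not the patch and not the real test framework code: the {{Group}} and {{FlakyTest}} annotations, the {{sketch.badapples}} property name, and the {{selectTests}} helper are all hypothetical stand-ins for illustration.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class BadAppleSketch {

    /** Hypothetical meta-annotation: a group of tests with a default state and an override property. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.ANNOTATION_TYPE)
    @interface Group {
        boolean enabled();
        String sysProperty();
    }

    /** Hypothetical marker for a flaky test; disabled by default in this sketch. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @Group(enabled = false, sysProperty = "sketch.badapples")
    @interface FlakyTest {
        String issue();
    }

    /** Stand-in test class: one stable test and one marked as flaky. */
    static class ExampleTests {
        public void stableTest() {}

        @FlakyTest(issue = "SOLR-12016")
        public void flakyTest() {}
    }

    /** The system property, if set, overrides the default declared on the group. */
    static boolean groupEnabled(Group group) {
        String override = System.getProperty(group.sysProperty());
        return override == null ? group.enabled() : Boolean.parseBoolean(override);
    }

    /** Returns the sorted names of the public methods that would run under the current settings. */
    static List<String> selectTests(Class<?> testClass) {
        List<String> selected = new ArrayList<>();
        Group group = FlakyTest.class.getAnnotation(Group.class);
        for (Method m : testClass.getDeclaredMethods()) {
            if (m.isSynthetic() || !Modifier.isPublic(m.getModifiers())) {
                continue;
            }
            boolean flaky = m.getAnnotation(FlakyTest.class) != null;
            if (!flaky || groupEnabled(group)) {
                selected.add(m.getName());
            }
        }
        Collections.sort(selected);
        return selected;
    }

    public static void main(String[] args) {
        System.out.println(selectTests(ExampleTests.class));
    }
}
```

In this sketch, flipping the {{enabled}} default on the group annotation (or passing {{-Dsketch.badapples=true}}) is the single switch that decides whether the marked tests run, which is why losing that one change leaves them in their previous state even when every comment has been updated.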

> Reduce noise from flakey tests
> ------------------------------
>                 Key: SOLR-12016
>                 URL:
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: Tests
>    Affects Versions: 7.2, master (8.0)
>            Reporter: Erick Erickson
>            Assignee: Erick Erickson
>            Priority: Major
>         Attachments: SOLR-12016-buildsystem.patch, SOLR-12016-buildsystem.patch, SOLR-12016-buildsystem.patch
> We had a discussion of this topic on the dev list; look for a thread titled "Test failures
are out of control.....". I'll try to summarize that discussion here so we can move this
JIRA forward. This may become an umbrella issue.
> Current situation concerns:
> > There is so much noise from flakey tests (particularly Solr tests) that they are
difficult to use.
> > The number of tests that regularly fail is increasing
> > Failures are being ignored
> > The number of failing tests makes releasing more difficult.
> > The number of failing tests makes it harder to determine whether recent changes actually
caused problems. Rerunning the tests until they succeed is common practice at present, which
is not robust.
> > e-mail notifications of failing tests are largely being ignored.
> Proposal:
> > Mark all currently "flakey" tests as BadApple or AwaitsFix
> > Run Jenkins jobs with BadApple (and/or AwaitsFix) enabled and disabled. Frequency
TBD, depends partly on whether we can label emails from these runs for easy filtering of the
two flavors.
> >> Label these runs with something suitable in the subject line (wish list)
> > Weekly reports on the tests labeled BadApple or AwaitsFix
> >> Perhaps this could be incorporated in the reports linked below (wish list)
> > Committers should enable BadApple (or AwaitsFix) regularly as a sanity check. Leave
these as defaults.
> > We start getting _much_ more aggressive about not allowing _new_ flakey tests.
> NOTE: It's perfectly acceptable to have failing flakey tests as long as someone is actively
working on _fixing_ them.
> Concerns with solution
> > Decreases test coverage
> > Decreases visibility of flakey tests, making fixing them less likely.
> > Some tools (see below) that report on bad tests will not see tests that are annotated
with BadApple or AwaitsFix.
> > Running unit tests and reporting errors are being conflated
> To be decided:
> > Can we label e-mails with failing tests with something in the subject line identifying
whether they were run with BadApple/AwaitsFix enabled or disabled? Can someone volunteer?
> > Is there any difference between BadApple and AwaitsFix? If not should we deprecate
one? I propose we just use AwaitsFix and deprecate BadApple.
> > Can the automated reports (see below) be enhanced to also report tests labeled BadApple
or AwaitsFix?
> Useful tools:
> > Steve Rowe's work on a Jenkins job to reproduce test failures (LUCENE-8106) 
> > Hoss has worked on aggregating all test failures from the 3 Jenkins systems (ASF,
Policeman, and Steve's), downloading the test results & logs, and running some reports/stats
on failures.
>   >>
>   >>
>   >>
> I've assigned this JIRA to myself, but all volunteers are welcome, especially for anything that
changes the build system.....
> I've decided to make this a SOLR JIRA on the theory that most of the offending tests
are in the Solr hive; any sub-tasks for touching the build system can go under LUCENE if wanted.
> Also, I expect to add the annotation to some more tests for a few days as infrequent
failures occur. Once we have stability (defined by there being little noise) that'll stop.
> 3 BadApple and 23 AwaitsFix annotations are currently in the code, linked to these issues:
> HADOOP-14044
> HADOOP-9893
> LUCENE-3869
> LUCENE-5575
> LUCENE-5595
> LUCENE-5737
> LUCENE-6709
> LUCENE-7161
> SOLR-2715
> SOLR-6213
> SOLR-6443
> SOLR-6944
> SOLR-7736
> SOLR-9036
> SOLR-10071
> SOLR-10107
> SOLR-10136
> SOLR-10734
> SOLR-10191
> SOLR-11134
> SOLR-11458
> SOLR-11714
> SOLR-11974
> Solr JIRAs about bad tests:
> SOLR-2175
> SOLR-4147
> SOLR-5880
> SOLR-6423
> SOLR-6944
> SOLR-6961
> SOLR-6974
> SOLR-8122
> SOLR-8182
> SOLR-9869
> SOLR-10053
> SOLR-10070
> SOLR-10071
> SOLR-10139
> SOLR-10287
> SOLR-10815
> SOLR-11911

This message was sent by Atlassian JIRA