mxnet-dev mailing list archives

From Marco de Abreu <marco.g.ab...@googlemail.com>
Subject Re: Call for Help for Fixing Flaky Tests
Date Sun, 14 Jan 2018 20:52:29 GMT
Sheng, could you provide a list of the tests you would cover with the
flaky plugin? I totally agree that we should not create a release while
test coverage is reduced, and restoring it properly should be our highest
priority. I'd propose that a test which takes less than 5 seconds can be
covered by the flaky plugin with a retry count of 5. Flaky tests which take
longer than 5 seconds have to be fixed before being re-enabled and must not
use the flaky plugin, in order to address Bhavin's concerns.

I'd argue against the nightly solution, as it basically limits visibility
of results to Amazon employees: nobody else really interacts with that CI
system, and results are not directly reported (unless we put in some effort
to create notifications etc., but that time is better spent actually fixing
the tests).

-Marco

On Sun, Jan 14, 2018 at 9:49 PM, Sheng Zha <zhasheng@apache.org> wrote:

> Hi Bhavin,
>
> Thank you for the support. Running it nightly is a great idea in that it
> doesn't compromise the coverage and we can still get notified fairly soon
> when things are breaking. Is there a way to subscribe to its result report?
>
> -sz
>
> On 2018-01-14 12:28, Bhavin Thaker <bhavinthaker@gmail.com> wrote:
> > Hi Sheng,
> >
> > I agree with doubling-down on the efforts to fix the flaky tests but do
> > not agree with compromising the stability of the test automation.
> >
> > As a compromise, we could probably run the flaky tests as part of the
> > nightly test automation -- would that work?
> >
> > I like your suggestion in another email thread of using this:
> > https://pypi.python.org/pypi/flaky
> > Maybe we could have a higher rerun count as part of the nightly test to
> > have better test automation stability.
> >
> > Bhavin Thaker.
> >
> > On Sun, Jan 14, 2018 at 12:21 PM, Sheng Zha <zhasheng@apache.org> wrote:
> >
> > > Hi Bhavin,
> > >
> > > Thanks for sharing your thoughts. Regarding the usage of the 'flaky'
> > > plugin for retrying flaky tests, it's proposed as a compromise, given
> > > that it will take time to properly fix the tests and we still need
> > > coverage in the meantime.
> > >
> > > I'm not sure that releasing before these tests are re-enabled is the
> > > right way, as it's not good practice to release features that are not
> > > covered by tests. Having done it before doesn't make it right. In that
> > > sense, release efforts shouldn't be a blocker for re-enabling tests.
> > > Rather, it should be the other way around, and the release should
> > > happen only after we recover the lost test coverage.
> > >
> > > I hope that we would do the right thing for our users. Thanks.
> > >
> > > -sz
> > >
> > > On 2018-01-14 11:00, Bhavin Thaker <bhavinthaker@gmail.com> wrote:
> > > > Hi Sheng,
> > > >
> > > > Thank you for your efforts and this proposal to improve the tests.
> > > > Here are my thoughts.
> > > >
> > > > Shouldn’t the focus be to _engineer_ each test to be reliable instead
> > > > of compromising and discussing the relative tradeoffs in re-enabling
> > > > flaky tests? Is the test failure probability really 10%?
> > > >
> > > > As you correctly mention, the experiences in making the tests reliable
> > > > will then serve as the standard for adding new tests rather than
> > > > continuing to chase the elusive goal of reliable tests.
> > > >
> > > > Hence, my non-binding vote is:
> > > > -1 for proposal #1 for re-enabling flaky tests.
> > > > +1 for proposal #2 for setting the standard for adding reliable tests.
> > > >
> > > > I suggest that we NOT compromise on the quality and reliability of the
> > > > tests, similar to the high bar maintained for the MXNet source code.
> > > >
> > > > If the final vote is to re-enable flaky tests, then I propose that we
> > > > enable them immediately AFTER the next MXNet release instead of doing
> > > > it during the upcoming release.
> > > >
> > > > Bhavin Thaker.
> > > >
> > > > On Sat, Jan 13, 2018 at 2:20 PM, Marco de Abreu <
> > > > marco.g.abreu@googlemail.com> wrote:
> > > >
> > > > > Hello Sheng,
> > > > >
> > > > > thanks a lot for leading this task!
> > > > >
> > > > > +1 for both points. Additionally, I'd propose to add the requirement
> > > > > to specify a reason if a new test takes more than X seconds (say 10)
> > > > > or adds an external dependency.
> > > > >
> > > > > Looking forward to getting these tests fixed :)
> > > > >
> > > > > Best regards,
> > > > > Marco
> > > > >
> > > > > On Sat, Jan 13, 2018 at 11:14 PM, Sheng Zha <zhasheng@apache.org> wrote:
> > > > >
> > > > > > Hi MXNet community,
> > > > > >
> > > > > > Thanks to the efforts of several community members, we identified
> > > > > > many flaky tests. These tests are currently disabled to ensure the
> > > > > > smooth execution of continuous integration (CI). As a result, we
> > > > > > lost coverage on those features. They need to be fixed and
> > > > > > re-enabled to ensure the quality of our releases. I'd like to
> > > > > > propose the following:
> > > > > >
> > > > > > 1. Re-enable flaky Python tests with retries if feasible
> > > > > > Although the tests are unstable, they would still be able to catch
> > > > > > breaking changes. For example, suppose a test fails randomly with
> > > > > > 10% probability; the probability of three failed runs in a row then
> > > > > > becomes 0.1% (0.1^3 = 0.001), assuming the runs fail independently.
> > > > > > On the other hand, a breaking change would result in 100% failure.
> > > > > > Although this could increase the testing time, it's a compromise
> > > > > > that can help avoid bigger problems.
> > > > > >
> > > > > > 2. Set a standard for new tests
> > > > > > I think having criteria that new tests should follow can help
> > > > > > improve not only the quality of tests, but also the quality of
> > > > > > code. I propose the following standard for tests.
> > > > > > - Reliably passing with good coverage
> > > > > > - Avoid randomness unless necessary
> > > > > > - Avoid external dependency unless necessary (e.g. due to license)
> > > > > > - Not resource-intensive unless necessary (e.g. scaling tests)
> > > > > >
> > > > > > In addition, I'd like to call for volunteers to help with fixing
> > > > > > the tests. New members are especially welcome, as it's a good
> > > > > > opportunity to become familiar with MXNet. Also, I'd like to
> > > > > > request that members who wrote the feature/test help, either by
> > > > > > fixing the tests or by helping others understand the issues.
> > > > > >
> > > > > > The effort on fixing the tests is tracked at:
> > > > > > https://github.com/apache/incubator-mxnet/issues/9412
> > > > > >
> > > > > > Best regards,
> > > > > > Sheng
> > > > > >
> > > > >
> > > >
> > >
> >
>
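
The retry arithmetic in the original proposal at the bottom of this thread (a 10% random failure rate and three runs) can be checked with a short sketch. It assumes that runs fail independently of each other, which real flaky tests may not do; `prob_all_runs_fail` and the variable names are hypothetical, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction


def prob_all_runs_fail(p_fail, runs):
    """Probability that `runs` independent runs of a test all fail."""
    return p_fail ** runs


# A test that fails randomly 10% of the time.
p_random = Fraction(1, 10)

# Chance of a spurious CI failure with three runs (the original proposal).
spurious_3 = prob_all_runs_fail(p_random, 3)

# Chance of a spurious CI failure with five runs (the retry count suggested later).
spurious_5 = prob_all_runs_fail(p_random, 5)

# A breaking change fails every run, so retries never mask it.
breaking = prob_all_runs_fail(Fraction(1), 5)
```

Under the independence assumption, `spurious_3` is 1/1000 (the 0.1% quoted in the proposal) and `breaking` stays at 1, so the retried test still reports a real regression every time.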
