mxnet-dev mailing list archives

From Bhavin Thaker <bhavintha...@gmail.com>
Subject Re: Call for Help for Fixing Flaky Tests
Date Sun, 14 Jan 2018 20:28:41 GMT
Hi Sheng,

I agree with doubling down on the efforts to fix the flaky tests, but I do
not agree with compromising the stability of the test automation.

As a compromise, we could probably run the flaky tests as part of the
nightly test automation -- would that work?

I like your suggestion, made in another email thread, of using
https://pypi.python.org/pypi/flaky. Maybe we could use a higher rerun count
in the nightly tests to get better test automation stability.
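
A minimal sketch of the idea, using only the standard library as a stand-in
for the rerun behavior a plugin such as flaky provides (this is not the
plugin's own API). The decorator name, the two rerun counts, and the
simulated 10% failure rate are all hypothetical:

```python
import functools
import random


def retry(max_runs=3):
    """Rerun a test up to max_runs times; pass as soon as one run passes."""
    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(max_runs):
                try:
                    return test_fn(*args, **kwargs)
                except AssertionError as err:
                    last_error = err  # remember the failure and try again
            raise last_error  # every run failed: report the last failure
        return wrapper
    return decorator


# Hypothetical rerun counts: fewer reruns per commit, more in the nightly run.
PER_COMMIT_RUNS = 3
NIGHTLY_RUNS = 10


def make_flaky_check(rng, failure_rate=0.1):
    """Build a check that fails at random, simulating a flaky test."""
    def check():
        assert rng.random() >= failure_rate, "simulated random failure"
    return check


rng = random.Random(0)  # fixed seed so the sketch is reproducible
nightly_test = retry(max_runs=NIGHTLY_RUNS)(make_flaky_check(rng))
nightly_test()  # raises only if all NIGHTLY_RUNS attempts fail
```

A test that is genuinely broken still fails on every attempt, so the higher
count only costs extra run time in that case.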

Bhavin Thaker.

On Sun, Jan 14, 2018 at 12:21 PM, Sheng Zha <zhasheng@apache.org> wrote:

> Hi Bhavin,
>
> Thanks for sharing your thoughts. Regarding the use of the 'flaky' plugin
> for retrying flaky tests, it is proposed as a compromise, given that it will
> take time to properly fix the tests and we still need coverage in the
> meantime.
>
> I'm not sure if releasing before these tests are re-enabled should be the
> way, as it's not a good practice to release features that are not covered
> by tests. Having done it before doesn't make it right. In that sense,
> release efforts shouldn't be a blocker for re-enabling tests. Rather, it
> should be the other way around, and release should happen only after we
> recover the lost test coverage.
>
> I hope that we would do the right thing for our users. Thanks.
>
> -sz
>
> On 2018-01-14 11:00, Bhavin Thaker <bhavinthaker@gmail.com> wrote:
> > Hi Sheng,
> >
> > Thank you for your efforts and this proposal to improve the tests. Here are
> > my thoughts.
> >
> > Shouldn’t the focus be to _engineer_ each test to be reliable instead of
> > compromising and discussing the relative tradeoffs in re-enabling flaky
> > tests? Is the test failure probability really 10%?
> >
> > As you correctly mention, the experiences in making the tests reliable will
> > then serve as the standard for adding new tests rather than continuing to
> > chase the elusive goal of reliable tests.
> >
> > Hence, my non-binding vote is:
> > -1 for proposal #1 for re-enabling flaky tests.
> > +1 for proposal #2 for setting the standard for adding reliable tests.
> >
> > I suggest that we NOT compromise on the quality and reliability of the
> > tests, similar to the high bar maintained for the MXNet source code.
> >
> > If the final vote is to re-enable flaky tests, then I propose that we
> > enable them immediately AFTER the next MXNet release instead of doing it
> > during the upcoming release.
> >
> > Bhavin Thaker.
> >
> > On Sat, Jan 13, 2018 at 2:20 PM, Marco de Abreu <
> > marco.g.abreu@googlemail.com> wrote:
> >
> > > Hello Sheng,
> > >
> > > thanks a lot for leading this task!
> > >
> > > +1 for both points. Additionally, I'd propose to add the requirement to
> > > specify a reason if a new test takes more than X seconds (say 10) or adds
> > > an external dependency.
> > >
> > > Looking forward to getting these tests fixed :)
> > >
> > > Best regards,
> > > Marco
> > >
> > > On Sat, Jan 13, 2018 at 11:14 PM, Sheng Zha <zhasheng@apache.org> wrote:
> > >
> > > > Hi MXNet community,
> > > >
> > > > Thanks to the efforts of several community members, we identified many
> > > > flaky tests. These tests are currently disabled to ensure the smooth
> > > > execution of continuous integration (CI). As a result, we lost coverage
> > > > on those features. They need to be fixed and re-enabled to ensure the
> > > > quality of our releases. I'd like to propose the following:
> > > >
> > > > 1, Re-enable flaky Python tests with retries if feasible
> > > > Although the tests are unstable, they would still be able to catch
> > > > breaking changes. For example, suppose a test fails randomly with 10%
> > > > probability. The probability of three consecutive failed runs then
> > > > becomes 0.1% (assuming the failures are independent). On the other hand,
> > > > a breaking change would result in 100% failure. Although this could
> > > > increase the testing time, it's a compromise that can help avoid bigger
> > > > problems.
> > > >
> > > > 2, Set standard for new tests
> > > > I think having criteria that new tests should follow can help improve
> > > > not only the quality of tests, but also the quality of code. I propose
> > > > the following standard for tests.
> > > > - Reliably passing with good coverage
> > > > - Avoid randomness unless necessary
> > > > - Avoid external dependency unless necessary (e.g. due to license)
> > > > - Not resource-intensive unless necessary (e.g. scaling tests)
> > > >
> > > > In addition, I'd like to call for volunteers to help fix the tests. New
> > > > members are especially welcome, as it's a good opportunity to get
> > > > familiar with MXNet. Also, I'd like to request that members who wrote
> > > > the feature/test help either by fixing, or by helping others understand
> > > > the issues.
> > > >
> > > > The effort on fixing the tests is tracked at:
> > > > https://github.com/apache/incubator-mxnet/issues/9412
> > > >
> > > > Best regards,
> > > > Sheng
> > > >
> > >
> >
>
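
The retry arithmetic in proposal #1 can be checked with a few lines of Python.
This is only a sketch: the function name is hypothetical, and it assumes each
run of a flaky test fails independently with the same probability, which real
flaky tests may not satisfy (for example, when failures come from a shared
resource).

```python
def prob_all_runs_fail(p_fail, runs):
    """Probability that a flaky test fails on every one of `runs` runs.

    Assumes each run fails independently with probability p_fail.
    """
    return p_fail ** runs


# The example from the proposal: 10% random failure rate, three runs.
print(f"flaky test, 3 runs:  {prob_all_runs_fail(0.1, 3):.4%}")

# A breaking change fails every run, so retries do not hide it.
print(f"broken test, 3 runs: {prob_all_runs_fail(1.0, 3):.4%}")

# A higher rerun count drives spurious failures down further.
for runs in (1, 3, 5, 10):
    print(runs, prob_all_runs_fail(0.1, runs))
```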
