mxnet-dev mailing list archives

From Chris Olivier <cjolivie...@gmail.com>
Subject Re: Improving and rationalizing unit tests
Date Mon, 16 Oct 2017 16:22:42 GMT
My take on the suggestion of purely deterministic inputs (including
deterministic seeding) is:

"I want the same values to be used for all test runs because it is
inconvenient when a unit test fails for some edge cases.  I prefer that
unforeseen edge case failures occur in the field and not during testing".

Is this the motivation?  Seems strange to me.
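
One way to keep fresh inputs on every run while still being able to replay a
failure is to draw a new seed per run and report it in the assertion message.
The following is only a minimal sketch using Python's standard library; the
TEST_SEED environment variable, the clip() function, and the
check_clip_invariants() helper are hypothetical names chosen for illustration
and are not part of the MXNet test suite.

```python
import os
import random


def clip(x, lo, hi):
    """Toy function under test (hypothetical): clamp x into [lo, hi]."""
    return max(lo, min(x, hi))


def check_clip_invariants(seed=None, trials=1000):
    """Check invariants of clip() on pseudo-random inputs.

    A fresh seed is drawn on each run unless one is supplied, and the seed
    is included in every failure message so the run can be replayed.
    """
    if seed is None:
        # TEST_SEED is an assumed convention for replaying a failing run.
        fresh = random.SystemRandom().randrange(2**32)
        seed = int(os.environ.get("TEST_SEED", fresh))
    rng = random.Random(seed)
    for _ in range(trials):
        lo = rng.uniform(-1e6, 1e6)
        hi = lo + rng.uniform(0.0, 1e6)
        x = rng.uniform(-2e6, 2e6)
        y = clip(x, lo, hi)
        # Invariant 1: the result always lies inside the interval.
        assert lo <= y <= hi, f"out of range (seed={seed}, x={x}, lo={lo}, hi={hi})"
        # Invariant 2: values already inside the interval are unchanged.
        if lo <= x <= hi:
            assert y == x, f"in-range value changed (seed={seed}, x={x})"
    return seed
```

Calling check_clip_invariants() with no arguments exercises different inputs
on each run; passing the seed from a failure message replays that exact run.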


On Mon, Oct 16, 2017 at 9:09 AM, Pedro Larroy <pedro.larroy.lists@gmail.com>
wrote:

> I think using a properly seeded and initialized (pseudo)random is actually
> beneficial (and deterministic); handpicked examples are usually too
> simplistic and miss corner cases.
>
> Better yet is to use property based testing, which will pick corner cases
> and do fuzzing automatically to check with high degree of confidence that a
> testing condition holds.
>
> Probably it would be good if we use a property based testing library in
> addition to nose to check invariants.
>
> A quick search turns up this one for Python, for example:
> https://hypothesis.readthedocs.io/en/latest/quickstart.html
> Does anyone have experience with it, or can anyone recommend a nice
> property based testing library for Python?
>
>
> Regards
>
> On Mon, Oct 16, 2017 at 4:56 PM, Bhavin Thaker <bhavinthaker@gmail.com>
> wrote:
>
> > I agree with Pedro.
> >
> > Based on various observations on unit test failures, I would like to
> > propose a few guidelines to follow for the unit tests. Even though I use
> > the word “must” for my humble opinions below, please feel free to suggest
> > alternatives or modifications to these guidelines:
> >
> > 1) 1a) Each unit test must have a run time budget <= X minutes. Say, X = 2
> > minutes max.
> > 1b) The total run time budget for all unit tests <= Y minutes. Say, Y = 60
> > minutes max.
> >
> > 2) All Unit tests must have deterministic (not Stochastic) behavior. That
> > is, instead of using the random() function to test a range of input values,
> > each input test value must be carefully hand-picked to represent the
> > commonly used input scenarios. The correct place to stochastically test
> > random input values is to have continuously running nightly tests and NOT
> > the sanity/smoke/unit tests for each PR.
> >
> > 3) All Unit tests must be as self-contained and independent of
> > external components as possible. For example, datasets required for the
> > unit test must NOT be hosted on an external website which, if unreachable,
> > can cause test run failures. Instead, all datasets must be available
> > locally.
> >
> > 4) It is impossible to test everything in unit tests and so only common
> > use-cases and code-paths must be tested in unit-tests. Less common
> > scenarios like integration with 3rd party products must be tested in
> > nightly/weekly tests.
> >
> > 5) A unit test must NOT be disabled on a failure unless proven to exhibit
> > unreliable behavior. The burden-of-proof for a test failure must be on the
> > PR submitter and the PR must NOT be merged without opening a new GitHub
> > issue explaining the problem. If the unit test is disabled for some reason,
> > then the unit test must NOT be removed from the unit tests list; instead
> > the unit test must be modified to add the following lines at the start of
> > the test:
> >     Print(“Unit Test DISABLED; see GitHub issue: NNNN”)
> >     Exit(0)
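
In Python, a comparable effect to the two lines in guideline 5 can be had with
the standard unittest skip decorator, which keeps the test in the test list
and records the reason in the report. This is a sketch of one possible
approach and not the mechanism the proposal prescribes; the class and test
names are hypothetical, and NNNN is the placeholder issue number from the
proposal above.

```python
import unittest


class TestExample(unittest.TestCase):
    # The reason string carries the issue reference; NNNN is a placeholder.
    @unittest.skip("Unit Test DISABLED; see GitHub issue: NNNN")
    def test_flaky_operator(self):
        self.fail("never executed while the test is skipped")

    def test_stable_operator(self):
        self.assertEqual(1 + 1, 2)


def run_suite():
    """Run both tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestExample)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Unlike an early exit inside the test body, the skipped test is counted
separately from passing tests, so a disabled test does not silently look
like a success.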
> >
> > Please suggest modifications to the above proposal so that we can make
> > the unit test framework a rock-solid foundation for the active
> > development of Apache MXNet (Incubating).
> >
> > Regards,
> > Bhavin Thaker.
> >
> >
> > On Mon, Oct 16, 2017 at 5:56 AM Pedro Larroy <pedro.larroy.lists@gmail.com>
> > wrote:
> >
> > > Hi
> > >
> > > Some of the unit tests are extremely costly in terms of memory and
> > > compute.
> > >
> > > As an example in the gluon tests we are loading all the datasets.
> > >
> > > test_gluon_data.test_datasets
> > >
> > > We are also running huge networks like resnets in test_gluon_model_zoo.
> > >
> > > This is ridiculously slow, and outright impossible on some embedded /
> > > memory constrained devices, and in any case makes the tests run longer
> > > than needed.
> > >
> > > Unit tests should be small, self-contained, and if possible pure (avoiding
> > > this kind of dataset IO).
> > >
> > > I think it would be better to split them into real unit tests and extended
> > > integration test suites that do more intensive computation. This would
> > > also help with the feedback time with PRs and CI infrastructure.
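
The split between fast unit tests and an extended suite could be sketched
with an environment switch, so that per-PR runs skip the expensive tests
while a nightly job enables them. This is only an illustration with the
standard unittest module; the RUN_EXTENDED_TESTS variable and the class names
are assumptions, and the "expensive" test is a stand-in for dataset loading or
a large model run.

```python
import os
import unittest

# RUN_EXTENDED_TESTS is a hypothetical switch; a nightly job would set it to "1".
RUN_EXTENDED = os.environ.get("RUN_EXTENDED_TESTS") == "1"


class TestFast(unittest.TestCase):
    def test_small_pure_computation(self):
        # Small, pure, no IO: suitable for every PR run.
        self.assertEqual(sum(range(10)), 45)


@unittest.skipUnless(RUN_EXTENDED, "extended suite; set RUN_EXTENDED_TESTS=1")
class TestExtended(unittest.TestCase):
    def test_expensive_placeholder(self):
        # Stand-in for dataset loading or a large model run.
        self.assertEqual(sum(range(1_000_000)), 499_999_500_000)


def run_all():
    """Run both suites and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite([
        loader.loadTestsFromTestCase(TestFast),
        loader.loadTestsFromTestCase(TestExtended),
    ])
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

With the variable unset, only TestFast executes and TestExtended is reported
as skipped; the same files serve both the PR check and the nightly run.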
> > >
> > >
> > > Thoughts?
> > >
> >
>
