mxnet-dev mailing list archives

From pracheer gupta <pracheer_gu...@hotmail.com>
Subject Re: Improving and rationalizing unit tests
Date Mon, 16 Oct 2017 16:00:06 GMT
That’s true, Pedro. In this particular context, I assumed that by “random” numbers
we mean random numbers that have not been explicitly seeded, which makes the
intermittently failing unit tests hard to reproduce.
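
A minimal Python sketch (not from the thread) of one way to keep random inputs while making failures reproducible: pick a seed, log it, and allow it to be replayed. The helper name make_test_rng and the environment variable TEST_SEED are hypothetical, chosen only for illustration.

```python
import os
import random


def make_test_rng(env_var="TEST_SEED"):
    """Return (seed, rng). Hypothetical helper, not an MXNet API.

    If the environment variable is set, reuse that seed so a failing run
    can be replayed; otherwise draw a fresh seed and log it.
    """
    raw = os.environ.get(env_var)
    if raw is not None:
        seed = int(raw)
    else:
        seed = random.SystemRandom().randrange(2**32)
    # Logging the seed is what makes an intermittent failure reproducible.
    print(f"[test] seed={seed}; rerun with {env_var}={seed} to reproduce")
    return seed, random.Random(seed)


seed, rng = make_test_rng()
inputs = [rng.random() for _ in range(3)]

# Replaying with the logged seed yields exactly the same inputs.
replay = random.Random(seed)
assert inputs == [replay.random() for _ in range(3)]
```

With this pattern a test can still sweep varied inputs, and a failure report only needs to include the logged seed.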

On Oct 16, 2017, at 8:51 AM, Pedro Larroy <pedro.larroy.lists@gmail.com> wrote:

That's not true. random() and similar functions are based on a PRNG, so they
can be debugged and are completely deterministic. A good practice is to use a
known seed for this.

More info: https://en.wikipedia.org/wiki/Pseudorandom_number_generator
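
As a small illustration of the determinism described above, here is a sketch using Python's standard random module (an assumption; the thread does not say which random() is meant). The function name draw and the seed value are arbitrary.

```python
import random


def draw(seed, n=5):
    """Draw n pseudorandom integers from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator; global state is untouched
    return [rng.randint(0, 99) for _ in range(n)]


# The same known seed always reproduces the same sequence.
assert draw(42) == draw(42)
```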

On Mon, Oct 16, 2017 at 5:42 PM, pracheer gupta <pracheer_gupta@hotmail.com> wrote:

@Chris: Any particular reason for -1? Randomness just prevents you from
writing tests that you can rely on and/or debug later in case of failure.

On Oct 16, 2017, at 8:28 AM, Chris Olivier <cjolivier01@gmail.com> wrote:

-1 for "must not use random numbers for input"

On Mon, Oct 16, 2017 at 7:56 AM, Bhavin Thaker <bhavinthaker@gmail.com> wrote:

I agree with Pedro.

Based on various observations of unit test failures, I would like to
propose a few guidelines to follow for the unit tests. Even though I use
the word “must” for my humble opinions below, please feel free to suggest
alternatives or modifications to these guidelines:

1) 1a) Each unit test must have a run time budget <= X minutes. Say, X = 2
minutes max.
1b) The total run time budget for all unit tests must be <= Y minutes. Say,
Y = 60 minutes max.

2) All unit tests must have deterministic (not stochastic) behavior. That
is, instead of using the random() function to test a range of input values,
each input test value must be carefully hand-picked to represent the
commonly used input scenarios. The correct place to stochastically test
random input values is in continuously running nightly tests and NOT in
the sanity/smoke/unit tests for each PR.

3) All unit tests must be as self-contained and independent of
external components as possible. For example, datasets required for the
unit test must NOT be hosted on an external website which, if unreachable,
can cause test run failures. Instead, all datasets must be available
locally.

4) It is impossible to test everything in unit tests and so only common
use-cases and code-paths must be tested in unit-tests. Less common
scenarios like integration with 3rd party products must be tested in
nightly/weekly tests.

5) A unit test must NOT be disabled on a failure unless proven to exhibit
unreliable behavior. The burden-of-proof for a test failure must be on the
PR submitter and the PR must NOT be merged without opening a new GitHub
issue explaining the problem. If the unit test is disabled for some reason,
then the unit test must NOT be removed from the unit tests list; instead
the unit test must be modified to add the following lines at the start of
the test:
  Print("Unit Test DISABLED; see GitHub issue: NNNN")
  Exit(0)
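
Guidelines 1a, 2 and 5 can be sketched together in Python with the standard unittest module. This is an illustration under stated assumptions, not the project's actual test harness: time_budget and clip are hypothetical names, the budget is only checked after the test finishes (it does not interrupt a hung test), and unittest.skip is swapped in for the Print/Exit lines so the disabled test still appears in reports. NNNN is kept as the placeholder from the proposal.

```python
import functools
import time
import unittest


def time_budget(seconds):
    """Fail the decorated test if it took longer than `seconds`.

    Hypothetical helper: the check runs after the test body returns,
    so it reports slow tests but cannot stop a test that never ends.
    """
    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(self, *args, **kwargs):
            start = time.perf_counter()
            result = test_func(self, *args, **kwargs)
            elapsed = time.perf_counter() - start
            self.assertLessEqual(
                elapsed, seconds,
                f"{test_func.__name__} exceeded its {seconds}s budget")
            return result
        return wrapper
    return decorator


def clip(x, lo, hi):
    """Toy function under test (stand-in for real library code)."""
    return max(lo, min(x, hi))


class ClipTest(unittest.TestCase):
    @time_budget(seconds=120)  # guideline 1a with X = 2 minutes
    def test_hand_picked_inputs(self):
        # Guideline 2: fixed, hand-picked inputs covering below-range,
        # boundary, in-range and above-range values.
        cases = [(-5, 0), (0, 0), (3, 3), (10, 10), (99, 10)]
        for x, expected in cases:
            with self.subTest(x=x):
                self.assertEqual(clip(x, 0, 10), expected)

    # Guideline 5: the test stays in the list but is reported as skipped.
    @unittest.skip("Unit Test DISABLED; see GitHub issue: NNNN")
    def test_disabled_but_still_listed(self):
        self.fail("never reached while the test is skipped")


if __name__ == "__main__":
    unittest.main()
```

Compared with printing a message and exiting with status 0, a skip keeps the remaining tests in the same process running and lets the runner count disabled tests separately from passing ones.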

Please suggest modifications to the above proposal so that we can make
the unit test framework a rock-solid foundation for the active
development of Apache MXNet (Incubating).

Regards,
Bhavin Thaker.


On Mon, Oct 16, 2017 at 5:56 AM Pedro Larroy <pedro.larroy.lists@gmail.com> wrote:

Hi

Some of the unit tests are extremely costly in terms of memory and compute.

As an example, in the gluon tests we are loading all the datasets:

test_gluon_data.test_datasets

We are also running huge networks like resnets in test_gluon_model_zoo.

This is ridiculously slow and outright impossible on some embedded /
memory-constrained devices, and in any case it makes the tests run longer
than needed.

Unit tests should be small, self-contained and, if possible, pure (avoiding
this kind of dataset IO).

I think it would be better to split them into real unit tests and extended
integration test suites that do more intensive computation. This would also
help with the feedback time for PRs and the CI infrastructure.
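
One possible way to sketch such a split in Python's unittest is an opt-in marker for the expensive suites. The environment variable RUN_INTEGRATION_TESTS, the integration decorator and the test class names are all hypothetical; the real MXNet test layout may differ.

```python
import os
import unittest

# Hypothetical opt-in switch: nightly CI would set it, PR checks would not.
RUN_INTEGRATION = os.environ.get("RUN_INTEGRATION_TESTS") == "1"

# Reusable marker for expensive suites (dataset IO, large networks).
integration = unittest.skipUnless(
    RUN_INTEGRATION,
    "integration test; set RUN_INTEGRATION_TESTS=1 to run")


class FastUnitTests(unittest.TestCase):
    """Small, pure tests that run on every PR."""

    def test_small_pure_computation(self):
        self.assertEqual(sum([1, 2, 3]), 6)


@integration
class SlowIntegrationTests(unittest.TestCase):
    """Placeholder for tests that load full datasets or large models."""

    def test_full_dataset_and_large_model(self):
        # A real suite would do the heavy work here.
        self.assertTrue(True)


if __name__ == "__main__":
    unittest.main()
```

Run without the variable, only the fast tests execute and the slow suite is reported as skipped; a nightly job could set the variable to run everything.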


Thoughts?


