Mailing-List: contact dev-help@mxnet.incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@mxnet.incubator.apache.org
MIME-Version: 1.0
References: <CAC_CU1jN40mq_=Lz6DwEDm+5necYM-HjLtd3xinVTazCsjY5vQ@mail.gmail.com>
 <CAFdgvPFz+EmOgFZa2e7qLeugmzcGWoTCTZYFz_Le9eOrOoZ9pQ@mail.gmail.com>
 <CABgAAff5gVYh=w4kFEoVgmATUWrzvX+AQUfU7rkfj4Hp_Y0HWA@mail.gmail.com>
 <MWHPR20MB1584AA12831200979329E9C7FD4F0@MWHPR20MB1584.namprd20.prod.outlook.com>
 <CAC_CU1h-2PbVn4Ui5MFO3DsO1awJ2UN+5kNraUYSr7F33EwAFQ@mail.gmail.com>
In-Reply-To: <CAC_CU1h-2PbVn4Ui5MFO3DsO1awJ2UN+5kNraUYSr7F33EwAFQ@mail.gmail.com>
From: Bhavin Thaker <bhavinthaker@gmail.com>
Date: Mon, 16 Oct 2017 16:01:05 +0000
Message-ID: <CAFdgvPH_iefBMqyhsrEy5Bsm6Mz_wA9HaZzcGHT7dd5OBXKSng@mail.gmail.com>
Subject: Re: Improving and rationalizing unit tests
To: Pedro Larroy <pedro.larroy.lists@gmail.com>, dev@mxnet.incubator.apache.org
Content-Type: multipart/alternative; boundary="94eb2c05436ac7b463055bac1dac"
archived-at: Mon, 16 Oct 2017 16:01:28 -0000

--94eb2c05436ac7b463055bac1dac
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On the randomness argument, I am more concerned about a unit test that
behaves differently from run to run. A stochastic test, IMHO, is not a good
sanity test, because the quality bar for code entry becomes stochastic
rather than deterministic, which causes a lot of churn in diagnosing unit
test failures for PRs.

There are other places (nightly) to do extensive tests.

PR-unit-tests are sanity tests and must be quick, reliable and consistent
for every PR.
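
As a minimal sketch of what I mean by a consistent sanity test, the example
below checks a function against a small table of hand-picked inputs. The
function relu() and the CASES table are hypothetical and only serve as an
illustration; they are not taken from the MXNet test suite.

```python
def relu(x):
    """Hypothetical function under test: rectified linear unit on a scalar."""
    return x if x > 0 else 0.0


# Hand-picked (input, expected) pairs covering negative, zero and positive.
CASES = [(-2.5, 0.0), (0.0, 0.0), (3.0, 3.0)]


def check_relu():
    """Return the cases that fail; an empty list means the check passes."""
    return [(x, want) for x, want in CASES if relu(x) != want]


failures = check_relu()
print("failing cases:", failures)
```

Every run exercises exactly the same inputs, so a failure points at the
code change rather than at the data.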

Bhavin Thaker.

On Mon, Oct 16, 2017 at 8:51 AM Pedro Larroy <pedro.larroy.lists@gmail.com>
wrote:

> That's not true. random() and similar functions are based on a PRNG, so
> the output can be debugged and is completely deterministic. A good practice
> is to use a known seed for this.
>
> More info: https://en.wikipedia.org/wiki/Pseudorandom_number_generator
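
The known-seed practice can be sketched as follows, using Python's standard
random module. The helper name sample_inputs and the seed value 42 are
assumptions made for illustration only.

```python
import random


def sample_inputs(seed, n=5):
    """Return n pseudorandom floats in [-1, 1] from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator; global random state is untouched
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


# Two calls with the same seed yield the same sequence of inputs.
first = sample_inputs(seed=42)
second = sample_inputs(seed=42)
print(first == second)
```

If a test logs the seed it used, a failing run can be replayed with the same
inputs.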
>
> On Mon, Oct 16, 2017 at 5:42 PM, pracheer gupta <
> pracheer_gupta@hotmail.com> wrote:
>
>> @Chris: Any particular reason for -1? Randomness just prevents writing
>> tests that you can rely on and/or debug later in case of failure.
>>
>> On Oct 16, 2017, at 8:28 AM, Chris Olivier <cjolivier01@gmail.com> wrote:
>>
>> -1 for "must not use random numbers for input"
>>
>> On Mon, Oct 16, 2017 at 7:56 AM, Bhavin Thaker <bhavinthaker@gmail.com>
>> wrote:
>>
>> I agree with Pedro.
>>
>> Based on various observations on unit test failures, I would like to
>> propose a few guidelines to follow for the unit tests. Even though I use
>> the word “must” for my humble opinions below, please feel free to suggest
>> alternatives or modifications to these guidelines:
>>
>> 1) 1a) Each unit test must have a run time budget <= X minutes. Say, X = 2
>> minutes max.
>> 1b) The total run time budget for all unit tests must be <= Y minutes. Say,
>> Y = 60 minutes max.
>>
>> 2) All unit tests must have deterministic (not stochastic) behavior. That
>> is, instead of using the random() function to test a range of input
>> values, each input test value must be carefully hand-picked to represent
>> the commonly used input scenarios. The correct place to stochastically
>> test random input values is the continuously running nightly tests and
>> NOT the sanity/smoke/unit tests for each PR.
>>
>> 3) All unit tests must be as self-contained and as independent of
>> external components as possible. For example, datasets required for a
>> unit test must NOT be hosted on an external website which, if
>> unreachable, can cause test run failures. Instead, all datasets must be
>> available locally.
>>
>> 4) It is impossible to test everything in unit tests and so only common
>> use-cases and code-paths must be tested in unit-tests. Less common
>> scenarios like integration with 3rd party products must be tested in
>> nightly/weekly tests.
>>
>> 5) A unit test must NOT be disabled on a failure unless proven to exhibit
>> unreliable behavior. The burden of proof for a test failure must be on the
>> PR submitter, and the PR must NOT be merged without opening a new GitHub
>> issue explaining the problem. If the unit test is disabled for some
>> reason, then the unit test must NOT be removed from the unit tests list;
>> instead the unit test must be modified to add the following lines at the
>> start of the test:
>>    Print("Unit Test DISABLED; see GitHub issue: NNNN")
>>    Exit(0)
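
For tests written with Python's unittest, one possible equivalent of the
Print/Exit pseudocode in guideline 5 is the skip decorator, which keeps the
test in the list and reports the reason. This is only a sketch: the class
and test names are hypothetical, and NNNN is the placeholder from the
proposal, not a real issue number.

```python
import io
import unittest


class ExampleTests(unittest.TestCase):
    # The test stays in the suite but is reported as skipped with this reason.
    @unittest.skip("Unit Test DISABLED; see GitHub issue: NNNN")
    def test_flaky_operator(self):
        self.fail("never runs while the skip decorator is present")

    def test_addition(self):
        self.assertEqual(1 + 1, 2)


def run_example():
    """Run ExampleTests quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(suite)


result = run_example()
for test, reason in result.skipped:
    print(test.id(), "->", reason)
```

Unlike an early exit, the skip shows up in the test report, so a disabled
test remains visible until the issue is closed.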
>>
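
The per-test budget in guideline 1a could be checked with a small timing
wrapper along these lines. The helper run_with_budget and the sample test
are hypothetical; the 120-second value corresponds to the suggested
X = 2 minutes. Note that this sketch only measures the run time after the
fact and does not interrupt a test that overruns.

```python
import time


def run_with_budget(test_fn, budget_seconds):
    """Run test_fn and return (within_budget, elapsed_seconds)."""
    start = time.perf_counter()
    test_fn()
    elapsed = time.perf_counter() - start
    return elapsed <= budget_seconds, elapsed


def quick_test():
    """Stand-in for a real unit test."""
    assert sum(range(1000)) == 499500


within_budget, elapsed = run_with_budget(quick_test, budget_seconds=120)
print("within budget:", within_budget)
```

Summing the elapsed values over all tests gives the figure to compare
against the total budget Y in guideline 1b.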
>> Please suggest modifications to the above proposal so that we can make
>> the unit test framework a rock-solid foundation for the active
>> development of Apache MXNet (Incubating).
>>
>> Regards,
>> Bhavin Thaker.
>>
>>
>> On Mon, Oct 16, 2017 at 5:56 AM Pedro Larroy <pedro.larroy.lists@gmail.com>
>> wrote:
>>
>> Hi
>>
>> Some of the unit tests are extremely costly in terms of memory and
>> compute.
>>
>> As an example in the gluon tests we are loading all the datasets.
>>
>> test_gluon_data.test_datasets
>>
>> We are also running huge networks like ResNets in test_gluon_model_zoo.
>>
>> This is ridiculously slow, outright impossible on some embedded or
>> memory-constrained devices, and in any case makes the tests run longer
>> than needed.
>>
>> Unit tests should be small, self-contained and, if possible, pure (avoiding
>> this kind of dataset IO).
>>
>> I think it would be better to split them into real unit tests and extended
>> integration test suites that do more intensive computation. This would
>> also help with feedback time on PRs and with the CI infrastructure.
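
One way such a split could look with Python's unittest is sketched below.
The class names FastTests and ExtendedTests and the build_suite helper are
hypothetical and do not describe the current MXNet layout.

```python
import unittest


class FastTests(unittest.TestCase):
    """Small, pure checks intended to run on every PR."""

    def test_small_sum(self):
        self.assertEqual(sum([1, 2, 3]), 6)


class ExtendedTests(unittest.TestCase):
    """Heavier checks intended for nightly or weekly runs."""

    def test_large_sum(self):
        self.assertEqual(sum(range(10**6)), 499999500000)


def build_suite(include_extended):
    """Return the PR suite, optionally with the extended tests added."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(FastTests))
    if include_extended:
        suite.addTests(loader.loadTestsFromTestCase(ExtendedTests))
    return suite


pr_suite = build_suite(include_extended=False)
nightly_suite = build_suite(include_extended=True)
print(pr_suite.countTestCases(), nightly_suite.countTestCases())
```

The PR job would run only the first suite, and the nightly job would run
both.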
>>
>>
>> Thoughts?
>>
>>
>>

--94eb2c05436ac7b463055bac1dac--