From: Bhavin Thaker
Date: Mon, 16 Oct 2017 16:01:05 +0000
Subject: Re: Improving and rationalizing unit tests
To: Pedro Larroy, dev@mxnet.incubator.apache.org

For the randomness argument, I am more concerned about a unit test that
exhibits different behavior on different runs. A stochastic test, IMHO, is
not a good sanity test, because the code-entry quality bar becomes stochastic
rather than deterministic, causing a lot of churn in diagnosing unit test
failures for PRs.

There are other places (nightly) to do extensive tests. PR unit tests are
sanity tests and must be quick, reliable and consistent for every PR.

Bhavin Thaker.

On Mon, Oct 16, 2017 at 8:51 AM Pedro Larroy wrote:

> That's not true. random() and similar functions are based on a PRNG. It
> can be debugged and it's completely deterministic; a good practice is to
> use a known seed for this.
>
> More info: https://en.wikipedia.org/wiki/Pseudorandom_number_generator
>
> On Mon, Oct 16, 2017 at 5:42 PM, pracheer gupta <
> pracheer_gupta@hotmail.com> wrote:
>
>> @Chris: Any particular reason for -1? Randomness just prevents writing
>> tests that you can rely on and/or debug later on in case of failure.
>>
>> On Oct 16, 2017, at 8:28 AM, Chris Olivier <cjolivier01@gmail.com> wrote:
>>
>> -1 for "must not use random numbers for input"
>>
>> On Mon, Oct 16, 2017 at 7:56 AM, Bhavin Thaker wrote:
>>
>> I agree with Pedro.
>>
>> Based on various observations of unit test failures, I would like to
>> propose a few guidelines to follow for the unit tests. Even though I use
>> the word “must” for my humble opinions below, please feel free to suggest
>> alternatives or modifications to these guidelines:
>>
>> 1) 1a) Each unit test must have a run time budget <= X minutes. Say, X = 2
>> minutes max.
>> 1b) The total run time budget for all unit tests <= Y minutes. Say, Y = 60
>> minutes max.
>>
>> 2) All unit tests must have deterministic (not stochastic) behavior. That
>> is, instead of using the random() function to test a range of input
>> values, each input test value must be carefully hand-picked to represent
>> the commonly used input scenarios. The correct place to stochastically
>> test random input values is in continuously running nightly tests and NOT
>> in the sanity/smoke/unit tests for each PR.
>>
>> 3) All unit tests must be as self-contained and independent of external
>> components as possible. For example, datasets required for a unit test
>> must NOT be hosted on an external website which, if unreachable, can
>> cause test run failures. Instead, all datasets must be available locally.
>>
>> 4) It is impossible to test everything in unit tests, so only common
>> use-cases and code-paths must be tested in unit tests. Less common
>> scenarios, like integration with 3rd party products, must be tested in
>> nightly/weekly tests.
>>
>> 5) A unit test must NOT be disabled on a failure unless proven to exhibit
>> unreliable behavior. The burden of proof for a test failure must be on the
>> PR submitter, and the PR must NOT be merged without opening a new GitHub
>> issue explaining the problem. If the unit test is disabled for some
>> reason, then the unit test must NOT be removed from the unit tests list;
>> instead the unit test must be modified to add the following lines at the
>> start of the test:
>> Print("Unit Test DISABLED; see GitHub issue: NNNN")
>> Exit(0)
>>
>> Please suggest modifications to the above proposal so that we can make
>> the unit test framework the rock-solid foundation for the active
>> development of Apache MXNet (Incubating).
>>
>> Regards,
>> Bhavin Thaker.
>>
>> On Mon, Oct 16, 2017 at 5:56 AM Pedro Larroy <pedro.larroy.lists@gmail.com>
>> wrote:
>>
>> Hi
>>
>> Some of the unit tests are extremely costly in terms of memory and
>> compute.
>>
>> As an example, in the gluon tests we are loading all the datasets:
>>
>> test_gluon_data.test_datasets
>>
>> We are also running huge networks like resnets in test_gluon_model_zoo.
>>
>> This is ridiculously slow, and straight impossible on some embedded /
>> memory constrained devices, and anyway is making tests run for longer
>> than needed.
>>
>> Unit tests should be small, self-contained, and if possible pure
>> (avoiding this kind of dataset IO if possible).
>>
>> I think it would be better to split them into real unit tests and
>> extended integration test suites that do more intensive computation. This
>> would also help with the feedback time with PRs and CI infrastructure.
>>
>> Thoughts?
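
The point made in the thread about PRNGs and known seeds can be illustrated with a minimal sketch. It uses only the Python standard library, not MXNet's test utilities. The names SEED, make_inputs and test_inputs_are_reproducible are hypothetical and chosen for illustration, and the seed value is arbitrary. The sketch assumes that a test draws its inputs from a generator it owns rather than from global state.

```python
import random

# Hypothetical fixed seed; the value is arbitrary and only needs to be known.
SEED = 1234


def make_inputs(seed, count=5):
    """Return `count` pseudorandom floats in [-1, 1] drawn from a seeded PRNG."""
    # A private Random instance keeps the test independent of global PRNG state.
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(count)]


def test_inputs_are_reproducible():
    # The same seed yields the same sequence, so a failure can be replayed.
    first = make_inputs(SEED)
    second = make_inputs(SEED)
    assert first == second
    assert all(-1.0 <= x <= 1.0 for x in first)
```

With a fixed seed such a test behaves the same on every run, which speaks to the reproducibility concern. Whether seeded inputs are preferable to hand-picked values is the question the thread leaves open.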