From: Bhavin Thaker
Date: Sun, 14 Jan 2018 12:28:41 -0800
Subject: Re: Call for Help for Fixing Flaky Tests
To: dev@mxnet.incubator.apache.org
Cc: dev@mxnet.apache.org

Hi Sheng,

I agree with doubling down on the efforts to fix the flaky tests, but I do
not agree with compromising the stability of the test automation.

As a compromise, we could probably run the flaky tests as part of the
nightly test automation. Would that work? I like your suggestion, made in
another email thread, of using this: https://pypi.python.org/pypi/flaky
Maybe we could use a higher rerun count in the nightly tests to get better
test automation stability.

Bhavin Thaker.

On Sun, Jan 14, 2018 at 12:21 PM, Sheng Zha wrote:

> Hi Bhavin,
>
> Thanks for sharing your thoughts. Regarding the usage of the 'flaky' plugin
> for retrying flaky tests, it's proposed as a compromise, given that it will
> take time to properly fix the tests and we still need coverage in the
> meantime.
>
> I'm not sure if releasing before these tests are re-enabled should be the
> way, as it's not a good practice to release features that are not covered
> by tests. Having done it before doesn't make it right. In that sense,
> release efforts shouldn't be a blocker for re-enabling tests. Rather, it
> should be the other way around, and release should happen only after we
> recover the lost test coverage.
>
> I hope that we would do the right thing for our users. Thanks.
>
> -sz
>
> On 2018-01-14 11:00, Bhavin Thaker wrote:
> > Hi Sheng,
> >
> > Thank you for your efforts and this proposal to improve the tests. Here
> > are my thoughts.
> >
> > Shouldn’t the focus be to _engineer_ each test to be reliable instead of
> > compromising and discussing the relative tradeoffs in re-enabling flaky
> > tests? Is the test failure probability really 10%?
> >
> > As you correctly mention, the experiences in making the tests reliable
> > will then serve as the standard for adding new tests rather than
> > continuing to chase the elusive goal of reliable tests.
> >
> > Hence, my non-binding vote is:
> > -1 for proposal #1 for re-enabling flaky tests.
> > +1 for proposal #2 for setting the standard for adding reliable tests.
> >
> > I suggest that we NOT compromise on the quality and reliability of the
> > tests, similar to the high bar maintained for the MXNet source code.
> >
> > If the final vote is to re-enable flaky tests, then I propose that we
> > enable them immediately AFTER the next MXNet release instead of doing it
> > during the upcoming release.
> >
> > Bhavin Thaker.
> >
> > On Sat, Jan 13, 2018 at 2:20 PM, Marco de Abreu
> > <marco.g.abreu@googlemail.com> wrote:
> >
> > > Hello Sheng,
> > >
> > > thanks a lot for leading this task!
> > >
> > > +1 for both points. Additionally, I'd propose to add the requirement to
> > > specify a reason if a new test takes more than X seconds (say 10) or
> > > adds an external dependency.
> > >
> > > Looking forward to getting these tests fixed :)
> > >
> > > Best regards,
> > > Marco
> > >
> > > On Sat, Jan 13, 2018 at 11:14 PM, Sheng Zha wrote:
> > >
> > > > Hi MXNet community,
> > > >
> > > > Thanks to the efforts of several community members, we identified
> > > > many flaky tests. These tests are currently disabled to ensure the
> > > > smooth execution of continuous integration (CI). As a result, we lost
> > > > coverage on those features. They need fixing and to be re-enabled to
> > > > ensure the quality of our releases.
> > > > I'd like to propose the following:
> > > >
> > > > 1. Re-enable flaky Python tests with retries if feasible
> > > > Although the tests are unstable, they would still be able to catch
> > > > breaking changes. For example, if a test fails randomly with 10%
> > > > probability, the probability that it fails three times in a row is
> > > > 0.1%. On the other hand, a breaking change would result in 100%
> > > > failure. Although this could increase the testing time, it's a
> > > > compromise that can help avoid bigger problems.
> > > >
> > > > 2. Set a standard for new tests
> > > > I think having criteria that new tests should follow can help improve
> > > > not only the quality of tests, but also the quality of code. I propose
> > > > the following standard for tests.
> > > > - Reliably passing with good coverage
> > > > - Avoid randomness unless necessary
> > > > - Avoid external dependency unless necessary (e.g. due to license)
> > > > - Not resource-intensive unless necessary (e.g. scaling tests)
> > > >
> > > > In addition, I'd like to call for volunteers to help with fixing the
> > > > tests. New members are especially welcome, as it's a good opportunity
> > > > to become familiar with MXNet. Also, I'd like to request that members
> > > > who wrote the feature/test help either by fixing the tests or by
> > > > helping others understand the issues.
> > > >
> > > > The effort on fixing the tests is tracked at:
> > > > https://github.com/apache/incubator-mxnet/issues/9412
> > > >
> > > > Best regards,
> > > > Sheng
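
The retry arithmetic in proposal 1 can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: the runs are treated
as independent, "three failed retries" is read as three runs in total, and
the names all_runs_fail_probability and retry are hypothetical helpers
written for this example. They are not the API of the 'flaky' plugin
mentioned in the thread.

```python
import functools


def all_runs_fail_probability(p_fail, runs):
    """Chance that a test fails on every one of `runs` independent runs."""
    return p_fail ** runs


def retry(max_runs=3):
    """Hypothetical rerun decorator: pass if any of max_runs attempts passes."""
    if max_runs < 1:
        raise ValueError("max_runs must be at least 1")

    def decorator(test_fn):
        @functools.wraps(test_fn)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(max_runs):
                try:
                    return test_fn(*args, **kwargs)
                except AssertionError as err:
                    # Remember the failure and try again.
                    last_error = err
            # Every attempt failed, so report the last failure.
            raise last_error
        return wrapper
    return decorator


if __name__ == "__main__":
    # Flaky test from the thread: 10% random failure rate, three runs.
    print(all_runs_fail_probability(0.10, 3))
    # A breaking change fails every run, so retries cannot hide it.
    print(all_runs_fail_probability(1.0, 3))
```

Under these assumptions a higher rerun count, as suggested for the nightly
runs, lowers the chance of a spurious failure further, while a test broken
by a real regression still fails on every attempt.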