From: Andrew Purtell
To: "dev@hbase.apache.org"
Reply-To: dev@hbase.apache.org
Date: Fri, 5 Apr 2013 08:54:13 -0700
Subject: Re: Disable flaky tests, and let Jenkin stay blue?
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=bcaec51ddd49479e2804d99f1947

--bcaec51ddd49479e2804d99f1947
Content-Type: text/plain; charset=ISO-8859-1

It would seem the point of this exercise is to segregate and eventually fix
_existing_ flaky tests. I don't see anyone advocating allowing commit of new
known flaky tests. I also don't see anyone advocating ignoring flaky tests
once segregated.
Therefore I'm not sure how to provide an opinion on these questions; they
don't seem to be grounded in the discussion we are having here. Please
correct me if I am mistaken.

On Friday, April 5, 2013, Ted Yu wrote:

> Some questions I have is w.r.t. new feature which introduces flaky test(s).
>
> Should presence of such test(s) affect the vote for integration of the new
> feature ?
> Should we spend more effort on such flaky test(s) ?
>
>
> On Wed, Apr 3, 2013 at 11:14 AM, Jimmy Xiang wrote:
>
> > HBASE-8256 was filed. We can discuss it further on the Jira if
> > interested.
> >
> > Thanks,
> > Jimmy
> >
> > On Tue, Apr 2, 2013 at 10:50 AM, Nicolas Liochon wrote:
> >
> > > I'm between +0 and -0.5
> > > +0 because I like green status. They help to detect regression.
> > >
> > > -0.5 because
> > >  - If we can't afford to fix it now I guess it won't be fixed in the
> > >    future: we will continue to keep it in the codebase (i.e. paying the
> > >    cost of updating it when we change an interface), but without any
> > >    added value as we don't run it.
> > >  - some tests failures are actually issues in the main source code. Ok,
> > >    they're often minor, but still they are issues. Last example I have
> > >    is from today: the one found by Jeff related HBASE-8204.
> > >  - and sometimes it shows lacks in the way we test (for example, the
> > >    waitFor stuff, while quite obvious in a way, was added only very
> > >    recently).
> > >  - often a flaky test is better than no test at all: they can still
> > >    detect regressions.
> > >  - I also don't understand why the precommit seems to be now better
> > >    than the main build.
> > >
> > > For me, doing it in a case by case way would be simpler (using the
> > > component owners: if a test on a given component is flaky, the decision
> > > can be taken between the people who want to remove the test and the
> > > component owners, with a jira, an analysis and a traced decision)
> > >
> > > Cheers,
> > >
> > > Nicolas
> > >
> > >
> > >
> > > On Tue, Apr 2, 2013 at 7:09 PM, Jimmy Xiang wrote:
> > >
> > > > We have not seen couple blue Jenkin builds for 0.95/trunk for quite
> > > > some time. Because of this, sometimes we ignore the precommit build
> > > > failures, which could let some bugs (code or test) sneaked in.
> > > >
> > > > I was wondering if it is time to disable all flaky tests and let
> > > > Jenkin stay blue. We can maintain a list of tests disabled, and get
> > > > them back once they are fixed. For each disabled test, if someone
> > > > wants to get it back, please file a jira so that we don't duplicate
> > > > the effort and work on the same one.
> > > >
> > > > As to how to define a test flaky, to me, if a test fails twice in the
> > > > last 10/20 runs, then it is flaky, if there is no apparent env issue.
> > > >
> > > > We have different Jenkins job for hadoop 1 and hadoop 2. If a test is
> > > > flaky for either one, it is flaky.
> > > >
> > > > What do you think?
> > > >
> > > > Thanks,
> > > > Jimmy
> > > >
> > >
> >

-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

--bcaec51ddd49479e2804d99f1947--
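
For reference, the flakiness rule proposed in Jimmy's original message quoted
above (a test that fails twice in the last 10/20 runs, with no apparent env
issue, and flaky in either the hadoop 1 or hadoop 2 job) could be sketched
roughly as follows. This is only an illustration, not anything used by the
HBase build: the outcome labels, the function names, the per-job history
format, and the reading of "fails twice" as "at least two failures" are all
assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical outcome labels; the thread does not define a data format.
PASS, FAIL, ENV = "pass", "fail", "env"


def is_flaky_in_job(outcomes: List[str], window: int = 10, threshold: int = 2) -> bool:
    """Apply the 'fails twice in the last N runs' rule to one job's history.

    `outcomes` is ordered oldest to newest. Runs marked ENV (failures
    attributed to an environment problem) are not counted as failures.
    Assumption: 'fails twice' is read as 'at least `threshold` failures'.
    """
    recent = outcomes[-window:]
    failures = sum(1 for outcome in recent if outcome == FAIL)
    return failures >= threshold


def is_flaky(history_by_job: Dict[str, List[str]], window: int = 10, threshold: int = 2) -> bool:
    """A test is flaky if it is flaky in any job (e.g. hadoop 1 or hadoop 2)."""
    return any(
        is_flaky_in_job(runs, window, threshold)
        for runs in history_by_job.values()
    )


# Example with made-up histories: clean on one job, two failures on the other.
history = {
    "hadoop1": [PASS] * 10,
    "hadoop2": [PASS, FAIL, PASS, PASS, ENV, PASS, FAIL, PASS, PASS, PASS],
}
print(is_flaky(history))  # True
```

The window is a parameter because the message leaves it open ("10/20"); a
larger window flags more tests, since older failures stay in view longer.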