Mailing-List: contact dev-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hbase.apache.org
MIME-Version: 1.0
In-Reply-To: 
 <CALte62wJ8BV18D5CPgktYxFhsXN6xoUD=+PsuVsT+y9cAoBf_g@mail.gmail.com>
References: 
 <CAOmV22vO1PLJ7WQ24B9X5A8Dy4xOLUmxvon=uCRzbdfCr_tKwg@mail.gmail.com>
	<CAPcDmStGcTDGZkYU04dO3spjpPUo2z-naw838MBFLrR67WTncA@mail.gmail.com>
	<CAOmV22vCa7nBi-bvfpnmfsJ2iXruD5S9y9JrOnD=pt-e1ZsLMQ@mail.gmail.com>
	<CALte62wJ8BV18D5CPgktYxFhsXN6xoUD=+PsuVsT+y9cAoBf_g@mail.gmail.com>
Date: Fri, 5 Apr 2013 08:54:13 -0700
Message-ID: 
 <CA+RK=_BUy_qjNXKuMTdP47ttAJeGE3bh6N8OFjrB4OGq7wqo6g@mail.gmail.com>
Subject: Re: Disable flaky tests, and let Jenkin stay blue?
From: Andrew Purtell <apurtell@apache.org>
To: "dev@hbase.apache.org" <dev@hbase.apache.org>
Content-Type: multipart/alternative; boundary=bcaec51ddd49479e2804d99f1947

--bcaec51ddd49479e2804d99f1947
Content-Type: text/plain; charset=ISO-8859-1

It would seem the point of this exercise is to segregate and eventually fix
_existing_ flaky tests. I don't see anyone advocating allowing commit of
new known flaky tests. I also don't see anyone advocating ignoring flaky
tests once segregated. Therefore I'm not sure how to provide an opinion
on these questions; they don't seem to be grounded in the discussion we are
having here. Please correct me if I am mistaken.

On Friday, April 5, 2013, Ted Yu wrote:

> I have some questions w.r.t. a new feature which introduces flaky test(s).
>
> Should the presence of such test(s) affect the vote for integration of the
> new feature?
> Should we spend more effort on such flaky test(s)?
>
>
> On Wed, Apr 3, 2013 at 11:14 AM, Jimmy Xiang <jxiang@cloudera.com> wrote:
>
> > HBASE-8256 was filed.  We can discuss it further on the Jira if interested.
> >
> > Thanks,
> > Jimmy
> >
> > On Tue, Apr 2, 2013 at 10:50 AM, Nicolas Liochon <nkeywal@gmail.com> wrote:
> >
> > > I'm between +0 and -0.5
> > > +0 because I like green status. It helps to detect regressions.
> > >
> > > -0.5 because
> > >    - If we can't afford to fix it now I guess it won't be fixed in the
> > > future: we will continue to keep it in the codebase (i.e. paying the
> > > cost of updating it when we change an interface), but without any
> > > added value as we don't run it.
> > >    - some test failures are actually issues in the main source code.
> > > Ok, they're often minor, but still they are issues. The last example I
> > > have is from today: the one found by Jeff related to HBASE-8204.
> > >    - and sometimes it shows gaps in the way we test (for example, the
> > > waitFor stuff, while quite obvious in a way, was added only very
> > > recently).
> > >    - often a flaky test is better than no test at all: they can still
> > > detect regressions.
> > >    - I also don't understand why the precommit seems to be now better
> > > than the main build.
> > >
> > > For me, doing it in a case by case way would be simpler (using the
> > > component owners: if a test on a given component is flaky, the decision
> > > can be taken between the people who want to remove the test and the
> > > component owners, with a jira, an analysis and a traced decision)
> > >
> > > Cheers,
> > >
> > > Nicolas
> > >
> > >
> > >
> > > On Tue, Apr 2, 2013 at 7:09 PM, Jimmy Xiang <jxiang@cloudera.com> wrote:
> > >
> > > > We have not seen a couple of blue Jenkins builds for 0.95/trunk for
> > > > quite some time.  Because of this, sometimes we ignore the precommit
> > > > build failures, which could let some bugs (code or test) sneak in.
> > > >
> > > > I was wondering if it is time to disable all flaky tests and let
> > > > Jenkins stay blue.  We can maintain a list of disabled tests, and get
> > > > them back once they are fixed. For each disabled test, if someone wants
> > > > to get it back, please file a jira so that we don't duplicate the
> > > > effort and work on the same one.
> > > >
> > > > As to how to define a test as flaky, to me, if a test fails twice in
> > > > the last 10/20 runs, then it is flaky, if there is no apparent env
> > > > issue.
> > > >
> > > > We have different Jenkins jobs for hadoop 1 and hadoop 2.  If a test
> > > > is flaky for either one, it is flaky.
> > > >
> > > > What do you think?
> > > >
> > > > Thanks,
> > > > Jimmy
> > > >
> > >
> >
>


-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

--bcaec51ddd49479e2804d99f1947--