Date: Wed, 8 Nov 2006 16:15:03 +0600
From: "Oleg Oleinik" <oleg.oleinik@gmail.com>
To: harmony-dev@incubator.apache.org
Subject: Re: [DRLVM] General stability

> "no regression" policy should be relevant to a number of *small* tests
> that are easy to run and are running fast, to make them good as
> pre-commit criteria.

Actually, I'm thinking about the following model (which goes a little bit
beyond pre-commit testing):

**Unit testing: a new feature is developed; unit (or whatever) tests are
created; the tests pass on certain platforms / runtime configurations; the
feature goes into JIRA along with its tests.

**Pre-commit testing: done by the committer with an agreed set of tests,
typically quick to run (example: classlib unit tests + vm/jit-specific
tests). Regression? - no commit, or exclude tests / ignore failures if
reasonable and agreed.

**Code integrity testing: done automatically ~hourly; the same set of
tests as for pre-commit testing may be used. Regression? - notify and fix
asap (or even roll back changes if appropriate), or just exclude tests if
reasonable and agreed.

**QA testing (say, nightly): one runs automated workloads (from
buildtest/) on one's platform(s) nightly (or from time to time).
Regression? - for example, 3 EUT tests or some Eclipse scenario started
failing after a certain commit - notify harmony-dev about the regression;
then it should be decided whether to stop new commits and fix the
regression asap, or to accept the regression and just exclude the tests.
The bugfixer can take automation scripts from buildtest/ and play with the
failing tests or scenario.
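The pre-commit gate in that model could be sketched as a small script that
runs the agreed quick suites and blocks the commit on any non-excluded
failure. This is only an illustration of the policy; the suite names and
the exclusion mechanism are hypothetical, not part of the actual Harmony
build infrastructure:

```python
#!/usr/bin/env python
"""Sketch of a pre-commit gate: run the agreed quick suites, block the
commit on any non-excluded failure. Suite names below are hypothetical."""

def run_precommit(suites, excluded=()):
    """suites: mapping of suite name -> zero-argument callable returning
    True on pass. Returns (ok_to_commit, failures), where failures lists
    the non-excluded suites that failed."""
    failures = []
    for name, run in suites.items():
        if name in excluded:
            continue  # agreed exclusion: a failure here does not block
        if not run():
            failures.append(name)
    return (not failures, failures)

if __name__ == "__main__":
    # Example: "classlib.unit" passes; "drlvm.smoke" fails but is on the
    # agreed exclude list, so the commit is still allowed.
    suites = {
        "classlib.unit": lambda: True,
        "drlvm.smoke": lambda: False,
    }
    ok, failed = run_precommit(suites, excluded={"drlvm.smoke"})
    print("commit allowed" if ok else "blocked by: " + ", ".join(failed))
```

The same check could be wired into the hourly code-integrity run, since
the model above uses the same test set for both stages.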
I think the more we care about regressions on an ongoing basis, the less
time we will need to achieve a milestone's stability requirements, and the
more sense application-enabling activities make. Who knows whether 2
months is sufficient for reaching the established stability level or not?
Why enable something if tomorrow it has a good chance of not working?

> Many successful projects (probably, all of them) have stability periods,
> even stability releases (and, yes, stability branches). That is
> considered effective. And IMO our project should act the same.

I support having milestones and releases. But after a milestone I don't
like seeing the achieved results lost.

On 08 Nov 2006 12:48:48 +0600, Egor Pasko wrote:
>
> On the 0x21B day of Apache Harmony Oleg Oleinik wrote:
> > Such a model works, but there is a risk of fixing again "from scratch"
> > those bugs which were fixed once in the previous milestones.
>
> sometimes it is easier to fix a couple of bugs "from scratch" than to
> spend a large amount of resources on regular complex checks (that also
> do not guarantee 100% stability)
>
> > We can eliminate this if we follow a "no regression" policy - if
> > something works (classlib unit tests, Tomcat or Eclipse Unit Tests
> > pass 100%, for example), it should continue working - any regression
> > is a subject for reporting and fixing as soon as possible (it is
> > easier to find the root cause and fix it, since we will know which
> > commit caused the regression).
> >
> > Will this model work? Isn't it a little bit better than focusing on
> > runtime stability periodically?
>
> "no regression" policy should be relevant to a number of *small* tests
> that are easy to run and are running fast, to make them good as
> pre-commit criteria.
>
> Complex workloads _cannot_ be run as pre-commit criteria. So there
> _should be regressions_.
> That's because:
> * we cannot afford to run them as pre-commit
> * we cannot afford complex rollbacks and stop-the-world
>
> Many successful projects (probably, all of them) have stability periods,
> even stability releases (and, yes, stability branches). That is
> considered effective. And IMO our project should act the same.
>
> We _have to_ allow some bugs to continue active development. But not
> too many. It is always a tradeoff.
>
> To summarize: I support your idea to improve the regression test base
> and infrastructure. Let it be a step-by-step improvement. Then we can
> decide which tests to run as pre-commit and which are to measure the
> overall stability.
>
> > On 11/8/06, Tim Ellison wrote:
> > >
> > > I wouldn't go so far as to label issues as "won't fix" unless they
> > > are really high-risk and low-value items.
> > >
> > > It's useful to go through a stabilization period where the focus is
> > > on getting the code solid again and delaying significant new
> > > functionality until it is achieved. A plan that aims to deliver
> > > stable milestones at regular periods is, in my experience, a good
> > > way to focus the development effort.
> > >
> > > Regards,
> > > Tim
> > >
> > > Weldon Washburn wrote:
> > > > Folks,
> > > >
> > > > I have spent the last two months committing patches to the VM.
> > > > While we have added a ton of much needed functionality, the
> > > > stability of the system has been ignored. By chance, I looked at
> > > > thread synchronization design problems this week. It's very
> > > > apparent that we lack the regression testing to really find
> > > > threading bugs, test the fixes and test against regression. No
> > > > doubt there are similar problems in other VM subsystems. "build
> > > > test" is necessary but not sufficient for where we need to go.
> > > > In a sense, committing code with only "build test" to prevent
> > > > regression is the equivalent of flying in the fog without
> > > > instrumentation.
> > > >
> > > > So that we can get engineers focused on stability, I am thinking
> > > > of coding the JIRAs that involve new features as "later" or even
> > > > "won't fix". Please feel free to comment.
> > > >
> > > > We also need to restart the old email threads on regression tests.
> > > > For example, we need some sort of automated test script that runs
> > > > Eclipse and Tomcat, etc. in a deterministic fashion so that we can
> > > > compare test results. It does not have to be perfect for starts,
> > > > just repeatable and easy to use. Feel free to beat me to starting
> > > > these threads :)
> > >
> > > --
> > >
> > > Tim Ellison (t.p.ellison@gmail.com)
> > > IBM Java technology centre, UK.
>
> --
> Egor Pasko
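The repeatable runner Weldon asks for hinges on comparing one run's
results against the previous run's. A minimal sketch of that comparison
step, assuming each workload can be reduced to a set of (test, status)
pairs (the workload names below are invented examples, not real buildtest/
scenarios):

```python
"""Sketch of deterministic result comparison: normalize each run's raw
results into a stable mapping, then diff against the prior run to spot
regressions. Workload/test names here are hypothetical examples."""

def summarize(results):
    """Normalize raw (test, status) pairs into a sorted mapping so two
    runs compare equal regardless of execution order."""
    return {name: status for name, status in sorted(results)}

def compare(baseline, current):
    """Return (regressed, fixed): tests that passed before but fail now,
    and tests that failed before but pass now."""
    regressed = [t for t, s in current.items()
                 if s == "FAIL" and baseline.get(t) == "PASS"]
    fixed = [t for t, s in current.items()
             if s == "PASS" and baseline.get(t) == "FAIL"]
    return regressed, fixed

if __name__ == "__main__":
    # Yesterday's nightly run vs. today's: one Eclipse scenario regressed.
    baseline = summarize([("eclipse.startup", "PASS"),
                          ("tomcat.deploy", "PASS")])
    current = summarize([("tomcat.deploy", "PASS"),
                         ("eclipse.startup", "FAIL")])
    regressed, fixed = compare(baseline, current)
    print("regressions:", regressed)
```

Repeatability then reduces to making each workload emit the same set of
test names in every run, which is exactly the "deterministic fashion"
constraint in the thread.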