Mailing-List: contact dev-help@harmony.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@harmony.apache.org
Message-ID: <46CEAB27.6080707@gmail.com>
Date: Fri, 24 Aug 2007 10:55:51 +0100
From: Tim Ellison <t.p.ellison@gmail.com>
User-Agent: Thunderbird 2.0.0.6 (Windows/20070728)
MIME-Version: 1.0
To: dev@harmony.apache.org
Subject: Re: [general] M3 milestone discussion
References: <6e47b64f0708120602q1e9a9c7fvf786b64a4d3ee991@mail.gmail.com>
	 <4dd1f3f00708121935m4b8ef591i9492d0242db432e5@mail.gmail.com>
	 <46CB4BF7.8020107@gmail.com>
	 <906dd82e0708212116y688ab795odff3df2df9fa1b02@mail.gmail.com>
	 <46CC11F7.1000502@gmail.com>
	 <906dd82e0708220609s2f242ba5pc20e42c398c408e1@mail.gmail.com>
	 <46CD709E.6090106@gmail.com>
 <906dd82e0708230536l1fe8a898l64a7d3a601d02be2@mail.gmail.com>
In-Reply-To: <906dd82e0708230536l1fe8a898l64a7d3a601d02be2@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Mikhail Loenko wrote:
> 2007/8/23, Tim Ellison <t.p.ellison@gmail.com>:
>> Mikhail Loenko wrote:
>>> 2007/8/22, Tim Ellison <t.p.ellison@gmail.com>:
>>>> Mikhail Loenko wrote:
>>>>> Based on M2 experience I think 2-mo is too short for Harmony:
>>>>> 25% of the whole time we would have our workspace somehow frozen.
>>>>> And we couldn't shorten freeze time since we have long running
>>>>> suites and scenarios.
>>>> That's not my memory, looking back in the list you froze the code on 24
>>>> June, and unfroze it on 30 June.
>>> There was also a "feature freeze" message on June 14th. So it's not 10%.
>> Rather than get into a debate about the %'s, let's decide whether we
>> have the right balance between open development and ensuring
>> stability/demonstrating progress.
>>
>> I'm sure we agree that we would like to minimize the disruption on
>> on-going development, but agree that we need these stability
>> checkpoints.  This (thread) is the first time I see a call for longer
>> open development periods.
>>
>>>> We need that length of time to run
>>>> tests and check stability as you mention, but it was more like 10% which
>>>> I think is reasonable given our current state.
>>>>
>>>>> IMHO it negatively affects progress of the project.
>>>>> So I'm +1 for having fixed schedule, but 2-mo schedule does not leave
>>>>> enough time for normal development
>>>> Can you explain what you mean here?  I see lots of 'normal development'
>>>> taking place, with hundreds of commits in each milestone.
>>> We declare that our milestone builds are "best so far". That actually means
>>> that we should not have (at least known) regressions.
>> Agreed.
>>
>>> We have a huge amount of tests and it's impossible to run them all
>>> before each commit. For that reason many commits introduce regressions.
>> Well hopefully not 'many commits' but it is a possibility yes<g>
>>
>>> Now the question is what % of time we may focus on development of new
>>> features vs time on fixing regressions. Based on CC results, it might take
>>> up to 2-3 weeks to fix regressions introduced by a commit (some scenarios are
>>> down for even longer time).
>> Is that because people are not looking at the CC results and fixing
>> them, or that we are short of machines to crunch through the scenarios?
> 
> The more machines the better. Currently BTI scenarios run on ~30 machines,
> but it may still take up to a week to notice a regression. The reasons are:
> 
> - we have many long-running scenarios
> - some failures are intermittent and thus not necessarily regressions
> - some failures are caused by side effects (e.g. we have tests that read/write files)
> - some failures are not reproducible when a single test is run
>   (they are reproducible only when the whole suite runs)
> - and more
> 
> Tim has mentioned how many commits we make, so it takes time to identify
> the guilty commit and find the reason for the regression...
> 
> Well, it does not always take that long to fix the regression, but still
> it's not a 5-minute task.

Doesn't that imply that we should check stability more often, rather than
let the side-effects build up over a longer period of time?

>>> So that actually means that in the 2-mo schedule
>>> we may do full-swing development during ~1 month, do very careful development
>>> 2 more weeks, and be mostly blocked 2 remaining weeks.
>> If we have introduced regressions, then fixing them in those two weeks
>> would seem like a good idea rather than continued open development.  How
>> long do you think is a reasonable time to let regressions ride?
> 
> For short-cycle tests (like classlib, drlvm-test) it should be hours.
> For long-running scenarios one week is "OK". But sometimes it may take more...
> 
> The problem is we have more than one developer :)

Indeed, so while I have sympathy for Mikhail F. saying that his work
pace may need to adjust to the timing of a milestone, it needs to be set
in the context of everyone else's work affecting him and him affecting
everyone else.

> If there is a single person working on the code and he sees a regression,
> he might stop and fix it.
> 
> If we have two people, A and B, working on area1 and area2, and e.g. A has
> introduced a regression into area1 so that scenarioX now fails, then the
> question is: should B stop his work until area1 is fixed?
> 
> If it's "hacking time" then B should probably continue development,
> if it's "stabilizing time" then B should probably stop and wait until
> it's fixed.

Agreed, and we don't want to leave it unfixed for too long.

>>> This is what I see in VM, API is definitely different: most changes are rather
>>> isolated.
>> We can certainly tweak the current practice if people feel it is
>> inhibiting the progress they could be making, I just want to ensure we
>> are not trading stability for more hacking :-)
> 
> I think we should maintain stability even when we are hacking. If a new feature
> introduces regressions we should fix them even if it's not "milestone time".

Agreed.

> So IMHO we should base our decision on
> - what ratio between "open development" and "constrained development"
> we'd like to have
> - what is mean time to repair regressions = R
> - how long is our full testing cycle = T
> 
> then the milestone schedule would ideally be something like:
> T for code freeze
> R + T for feature freeze
> 
> and period would be
> (R + 2T)/(constrained%)
> adjusted by:
> - how often we think community wants to see stable builds
> 
> ;)

I agree with all that, and if somebody thinks that two months' work is
not enough then let them propose an alternative.  It works for me but we
should seek consensus.
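
For anyone who wants to plug numbers into the formula quoted above
(T for code freeze, R + T for feature freeze, period = (R + 2T) / constrained%),
here is a minimal sketch. The function name and the sample values (R = 7 days,
T = 7 days, 25% constrained time) are assumptions for illustration only, not
measurements taken from the CC or BTI runs.

```python
def milestone_plan(repair_days, test_cycle_days, constrained_fraction):
    """Return (code_freeze_days, feature_freeze_days, period_days).

    repair_days          -- R, mean time to repair regressions
    test_cycle_days      -- T, length of one full testing cycle
    constrained_fraction -- share of the cycle spent in "constrained
                            development", as a fraction in (0, 1]
    """
    if not 0 < constrained_fraction <= 1:
        raise ValueError("constrained_fraction must be in (0, 1]")

    code_freeze = test_cycle_days                    # T
    feature_freeze = repair_days + test_cycle_days   # R + T
    constrained_total = code_freeze + feature_freeze  # R + 2T
    period = constrained_total / constrained_fraction
    return code_freeze, feature_freeze, period


# Hypothetical inputs: one week to repair, one week to test, 25% constrained.
print(milestone_plan(7, 7, 0.25))
```

With those sample inputs the code freeze lasts 7 days, the feature freeze
14 days, and the full cycle 84 days.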

Regards,
Tim