db-derby-dev mailing list archives

From Mike Matrigali <mikem_...@sbcglobal.net>
Subject Re: Regression Test Harness handling of "duration" field
Date Wed, 08 Feb 2006 20:06:29 GMT
Tracking test time is nice, but remember that the point of tests
is not performance measurement.  Having nothing else, it
is reasonable to look at cases as described below - but
we shouldn't fool ourselves that it is a great way to measure
performance regression.  As has been reported on the list,
tests tend to do one thing once, and any sort of outside
influence from the machine or other processes can easily
skew the numbers (true of any performance measurement, of course).

It would be better if we had some sort of performance
regression test suite.  Personally I like 2 flavors of
such a beast.  One is a very directed test that is meant
to measure pieces of the system, sort of like unit testing,
rather than to model a user application.  Cloudscape had
such a beast, but it was not donated because it contained a lot
of customer-based data which we could not donate.  It also
sort of grew like the current test harness, so it may be
better to start fresh rather than port the code.

The other flavor is standardized open source benchmarks.  It
looks like some contributors on the list are working with
TPC-like benchmarks.  These test the whole system and are
good for system regression testing, but it is hard work for a
developer to go from "this test is 10% slower" to which line
of code caused it.

I am just wondering if this is a problem that JUnit or some
other standard open source harness can handle rather than
creating yet another harness in Derby.  I don't really know
much about JUnit.  The features I would like from such a harness are:

o control an init/cleanup routine per test
o control an init/cleanup routine per thread in test
o control number of threads
o control number of iterations of test
o control number of repeats of iterations of test
o collect elapsed time of each of the pieces (test, per user, and 
overall test).
o allow for properties to be passed in to control test behavior

So you could write a simple insert test and then with the same
implementation try out:
o 1000 inserts
o 10 runs of 1000 inserts
o 1000 inserts, 10 users
o 10 runs of 1000 inserts, 10 users
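
To make the feature list above concrete, here is a rough sketch of what
such a harness could look like in plain Java.  Everything in it is
hypothetical: PerfHarness, PerfTest, RunResult and run() are names made up
for illustration and are not part of Derby, JUnit, or any existing harness.
It also assumes that "1000 inserts, 10 users" means 1000 iterations per
user; the original text does not say which reading is intended.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical sketch; none of these names come from Derby or JUnit.
final class PerfHarness {

    /** One unit of work with optional per-test and per-thread init/cleanup. */
    interface PerfTest {
        default void initTest() throws Exception {}
        default void initThread(int threadId) throws Exception {}
        void iterate(int threadId, int iteration) throws Exception;
        default void cleanupThread(int threadId) throws Exception {}
        default void cleanupTest() throws Exception {}
    }

    /** Elapsed wall clock time for one run: overall and per thread ("user"). */
    static final class RunResult {
        final long totalNanos;
        final long[] perThreadNanos;

        RunResult(long totalNanos, long[] perThreadNanos) {
            this.totalNanos = totalNanos;
            this.perThreadNanos = perThreadNanos;
        }
    }

    /** Repeats a run `runs` times; each run uses `threads` threads doing `iterations` each. */
    static List<RunResult> run(PerfTest test, int threads, int iterations, int runs)
            throws Exception {
        List<RunResult> results = new ArrayList<>();
        test.initTest();
        try {
            for (int r = 0; r < runs; r++) {
                results.add(runOnce(test, threads, iterations));
            }
        } finally {
            test.cleanupTest();
        }
        return results;
    }

    private static RunResult runOnce(PerfTest test, int threads, int iterations)
            throws Exception {
        long[] perThread = new long[threads];
        Throwable[] failures = new Throwable[threads];
        CountDownLatch ready = new CountDownLatch(threads); // all threads initialized
        CountDownLatch start = new CountDownLatch(1);       // release all threads together
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                try {
                    try {
                        test.initThread(id);
                    } finally {
                        ready.countDown(); // always signal, even if init failed
                    }
                    start.await();
                    long begin = System.nanoTime();
                    for (int i = 0; i < iterations; i++) {
                        test.iterate(id, i);
                    }
                    perThread[id] = System.nanoTime() - begin;
                    test.cleanupThread(id); // skipped on failure in this sketch
                } catch (Throwable e) {
                    failures[id] = e;
                }
            });
            workers[t].start();
        }
        ready.await(); // keep per-thread init out of the timed region
        long begin = System.nanoTime();
        start.countDown();
        for (Thread w : workers) {
            w.join();
        }
        long total = System.nanoTime() - begin; // includes per-thread cleanup
        for (Throwable f : failures) {
            if (f != null) {
                throw new Exception("worker thread failed", f);
            }
        }
        return new RunResult(total, perThread);
    }
}
```

With something like this, the four cases above would be run(test, 1, 1000, 1),
run(test, 1, 1000, 10), run(test, 10, 1000, 1) and run(test, 10, 1000, 10),
all against the same test implementation.  Passing properties in to control
test behavior is left out of the sketch.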

extra credit:
o collect I/O stats (don't think there is a 100% pure Java way)
o collect system vs. user time
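
For the system vs. user time item, one possible approach is the standard
java.lang.management.ThreadMXBean, which can report CPU time and user-mode
time for the current thread on JVMs that support it.  The sketch below is
only an illustration: CpuTimeProbe and measure() are made-up names, and
system time is approximated as total CPU time minus user time.  It does not
cover I/O stats.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical helper; only the ThreadMXBean calls are real JDK API.
final class CpuTimeProbe {

    /**
     * Runs `work` on the current thread and returns
     * {cpuNanos, userNanos, approxSystemNanos}, or null if the JVM does not
     * support (or has disabled) per-thread CPU time measurement.
     */
    static long[] measure(Runnable work) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isCurrentThreadCpuTimeSupported() || !bean.isThreadCpuTimeEnabled()) {
            return null;
        }
        long cpuBefore = bean.getCurrentThreadCpuTime();
        long userBefore = bean.getCurrentThreadUserTime();
        work.run();
        long cpu = bean.getCurrentThreadCpuTime() - cpuBefore;
        long user = bean.getCurrentThreadUserTime() - userBefore;
        // The two counters may have different resolution, so clamp at zero.
        long system = Math.max(0L, cpu - user);
        return new long[] {cpu, user, system};
    }
}
```

This only accounts for the calling thread, so in a multi-user test each
worker thread would have to take its own measurement.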

Bryan Pendleton wrote:
> David W. Van Couvering wrote:
>> My understanding it's the wall clock time. 
>>>   derbyall   630 6 624 0
>>>     Duration   45.6%
> OK. So it's saying that this particular run of 'derbyall'
> took 46.2% of the wall clock time that 'derbyall' took
> on August 2, 2005.
> That makes sense.
> Given that this particular run failed badly, we probably
> don't care about the Durations, then.
> But in general, we'd probably want to keep our eyes open
> for Duration values that started to get significantly
> higher than 100%, because that would mean that we might
> have accidentally introduced a performance regression.
> Perhaps we could have some sort of trigger, so that if a
> suite experienced a duration of, say, 150%, that was
> treated as a regression failure, even if all the tests in
> that suite passed?
> thanks,
> bryan
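
The trigger suggested in the quoted text above could be sketched roughly
as follows.  DurationCheck and its methods are hypothetical names, not part
of the existing harness, and the sketch assumes that Duration is simply the
current wall clock time expressed as a percentage of a baseline run.

```java
// Hypothetical sketch of a duration-based regression trigger.
final class DurationCheck {

    /** Current wall clock time as a percentage of the baseline time. */
    static double percentOfBaseline(long currentMillis, long baselineMillis) {
        if (baselineMillis <= 0) {
            throw new IllegalArgumentException("baseline must be positive");
        }
        return 100.0 * currentMillis / baselineMillis;
    }

    /** True when the suite took at least thresholdPercent of the baseline time. */
    static boolean isRegression(long currentMillis, long baselineMillis,
                                double thresholdPercent) {
        return percentOfBaseline(currentMillis, baselineMillis) >= thresholdPercent;
    }
}
```

For example, isRegression(current, baseline, 150.0) would flag a suite
whose duration reached 150% of the baseline even if every test in it passed.
Given how easily outside influences skew wall clock numbers, a single run
over the threshold is probably better treated as a warning than as proof
of a regression.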
