harmony-dev mailing list archives

From Oliver Deakin <oliver.dea...@googlemail.com>
Subject Re: [classlib][testing] Size of test input files
Date Thu, 17 Jan 2008 11:34:19 GMT
Sian January wrote:
> Hi Tim,
> On 16/01/2008, Tim Ellison <t.p.ellison@gmail.com> wrote:
> <SNIP!>
> At the moment I am using Harmony modules, but I'm assuming that apart
> from the space issue it would be OK to use any Apache-licensed code.
>>> Also if I generate source material I'm only testing things I've
>>> thought of (which is more likely to work anyway since I wrote the
>>> code).  Using large real-life programs tests things that I haven't
>>> thought of and should also mean that the test coverage will be much
>>> higher.
>> I can imagine you doing this locally to test the code, then generating
>> your own examples to put in the test suite to get good coverage.
> Yes - it would make sense to do some testing locally and not contribute it
> all, especially if the code base isn't going to change that much.  I do
> think it's still of value to have some real-life examples in the test suite
> though, so I suppose I'm just trying to get a feel for how much is
> appropriate.

I would think a combination of both approaches would be good.

We want to ensure that we have good code coverage from the tests. 
Downloading a set of real-world jars does not necessarily guarantee good 
coverage, which is where designing your own test archives is important. 
Even if you know the code yourself, creating tests to the spec (and not 
to the code) should ensure that future regressions are detected, and 
will probably turn up bugs as you develop the code. The issue with 
real-world archives is that they are variable, which is both a good and 
a bad thing. While they provide constantly evolving test cases, and may 
expose code paths you had never thought of testing, they do not provide 
consistent code coverage.

If you feel that knowing the code too well is an issue, then perhaps 
develop test archives to the spec up front for new functionality you are 
adding. That way, once the implementation is complete, you have a set of 
tests ready to check its correctness, and you immediately have a 
regression suite prepared.

IMHO using a combination of "home-made" and real-world archives to 
ensure coverage and inject a little randomness would be a good idea.

>> While you will be writing these and testing the code you wrote, that's
>> not much different to you choosing the set of external archives to
>> include then ensuring your code passes with them -- i.e. you will ensure
>> your code works with the set you choose, which won't be exhaustive either.
> I disagree here.  I think a week (or month etc) testing with different
> existing code would produce much better coverage and a larger variety of
> test cases than spending the same amount of time writing a code generator
> would.  However I'll admit I don't know that much about generating code, so
> maybe I'm missing something here...

Unless you can ensure the generated code and archives give good enough 
code coverage (I imagine this would be quite hard to do), I would 
probably not use code generation for all tests.

I might approach this by doing the following:
 - Create a test suite of small archives which target particular pieces 
of pack200 functionality (written to the spec). In addition to these, 
add some implementation-specific tests where possible.
 - To test larger, more complex archives, use a combination of 
home-made, real-world and generated archives. This might be overkill (I 
don't think the code generation would be easy), and possibly only one or 
two of these approaches would be necessary.
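To illustrate the first point, a "home-made" archive can be built 
programmatically so its contents are fully deterministic, then handed to 
the pack200 implementation under test. A minimal sketch (class name, 
entry names and contents are all hypothetical, not from any existing 
test suite):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;

public class MakeTestArchive {
    public static void main(String[] args) throws IOException {
        // Build a tiny jar whose contents are fully known, so coverage
        // of specific archive cases is repeatable run after run.
        File jar = File.createTempFile("pack200-fixture", ".jar");
        try (JarOutputStream jos =
                 new JarOutputStream(new FileOutputStream(jar))) {
            // A directory entry, an empty file and a small text entry --
            // each targets a different corner of archive handling.
            jos.putNextEntry(new JarEntry("org/example/"));
            jos.closeEntry();
            jos.putNextEntry(new JarEntry("org/example/Empty.txt"));
            jos.closeEntry();
            jos.putNextEntry(new JarEntry("org/example/notes.txt"));
            jos.write("known, deterministic content"
                          .getBytes(StandardCharsets.UTF_8));
            jos.closeEntry();
        }

        // Sanity-check the fixture before feeding it to the packer
        // under test: three entries, and the text entry is present.
        try (JarFile jf = new JarFile(jar)) {
            System.out.println(jf.size());
            System.out.println(jf.getEntry("org/example/notes.txt") != null);
        }
        jar.deleteOnExit();
    }
}
```

A test could then pack and unpack this fixture and assert that every 
entry survives the round trip, since the expected contents are known in 
advance (unlike with a downloaded real-world jar).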


>> Regards,
>> Tim
> Thanks,
> Sian

Oliver Deakin
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number 741598. 
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
