harmony-dev mailing list archives

From: Leo Simons <m...@leosimons.com>
Subject: Re: [classlib] Testing
Date: Wed, 22 Mar 2006 13:50:33 GMT
On Wed, Mar 22, 2006 at 08:02:44AM -0500, Geir Magnusson Jr wrote:
> Leo Simons wrote:
> >On Wed, Mar 22, 2006 at 07:15:28AM -0500, Geir Magnusson Jr wrote:
> >>Pulling out of the various threads where we have been discussing, can we 
> >>agree on the problem :
> >>
> >>We have unique problems compared to other Java projects because we need 
> >>to find a way to reliably test the things that are commonly expected to 
> >>be a solid point of reference - namely the core class library.
> >>
> >>Further, we've been implicitly doing "integration testing" because - so 
> >>far - the only way we've been testing our code has been 'in situ' in the 
> >>VM - not in an isolated test harness.  To me, this turns it into an 
> >>integration test.
> >>
> >>Sure, we're using JUnit, but because of the fact we are implementing core 
> >>java.* APIs, we aren't testing with a framework that has been 
> >>independently tested for correctness, like we would when testing any 
> >>other code.
> >>
> >>I hope I got that idea across - I believe that we have to go beyond 
> >>normal testing approaches because we don't have a normal situation.
> >
> >Where we define 'normal situation' as "running a test framework on top of
> >the sun jdk and expecting any bugs to not be in that jdk". There's plenty
> >of projects out there that have to test things without having such a
> >"stable reference JDK" luxury.....I imagine that testing GCC is just as
> >hard as this problem we have here :-)
> 
> Is it the same?  We need to have a running JVM+classlibrary to test the 
> classlibrary code.

Well you need a working C compiler and standard C library to compile the
compiler so you can compile make so you can build bash so you can run
perl (which uses the standard C library functions all over the place of
course) so you can run the standard C library tests so that you know that
the library you used when compiling the compiler was correct so you can
run the compiler tests. I don't think they actually do things that way, but
it seems like basically the same problem. Having a virtual machine just
makes it easier since you still assume "the native world" as a baseline,
which is a lot more than "the hardware".

> >>So I think there are three things we want to do (adopting the 
> >>terminology that came from the discussion with Tim and Leo ) :
> >>
> >>1) implementation tests
> >>2) spec/API tests (I'll bundle together)
> >>3) integration/functional tests
> >>
> >>I believe that for #1, the issues related to being on the bootclasspath 
> >>don't matter, because we aren't testing that aspect of the classes 
> >>(which is how they behave integrated w/ the VM and security system) but 
> >>rather the basic internal functioning.
> >>
> >>I'm not sure how to approach this, but I'll try.  I'd love to hear how 
> >>Sun, IBM or BEA deals with this, or be told why it isn't an issue :)
> >>
> >>Implementation tests : I'd like to see us be able to do #1 via the 
> >>standard same-package technique (i.e. testing a.b.C w/ a.b.CTest) but 
> >>we'll run into a tangle of classloader problems, I suspect, because we 
> >>want to be testing java.* code in a system that already has java.* code. 
> >> Can anyone see a way we can do this - test the classlibrary from the 
> >>integration point of view - using some test harness + any known-good 
> >>JRE, like Sun's or IBM's?
> >
> >Ew, that won't work in the end since we should assume our own JRE is going
> >to be "known-better" :-). But it might be a nice way to "bootstrap" (eg
> >we test with an external JRE until we satisfy the tests and then we switch
> >to testing with an earlier build).
> 
> Lets be clear - even using our own "earlier build" doesn't solve the 
> problem I'm describing, because as it stands now, we don't use "earlier 
> build" classes to test with - we use the code we want to test as the 
> classlibrary for the JRE that's running the test framework.
> 
> The classes that we are testing are also the classes used by the testing 
> framework.  IOW, any of the java.* classes that JUnit itself needs (ex. 
> java.util.HashMap) are exactly the same implementation that it's testing.
> 
> That's why I think it's subtly different than a "bootstrap and use 
> version - 1 to test" problem.  See what I mean?

Yeah yeah, I was already way beyond thinking "just" JUnit is usable for the
kind of test you're describing. At some point, fundamentally, you either trust
something external (whether it's the sun jdk or the intel compiler designers,
at some point you do draw a line) or you find a way to bootstrap.
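
To make the same-package idea from further up concrete: here's a rough sketch of
what an implementation test for java.util could look like (the class and method
names are placeholders, not actual Harmony tests). The catch is that a stock JRE
refuses to define user classes in java.* packages from the ordinary classpath, so
something like this only loads if both the classlib under test and the test
classes are pushed onto the boot class path:

    // Minimal sketch of the same-package technique; names are made up.
    // Living in java.util means the test could also reach package-private
    // internals of the implementation, not just the public API.
    package java.util;

    import junit.framework.TestCase;

    public class HashMapImplTest extends TestCase {
        public void testPutThenGet() {
            HashMap map = new HashMap();
            map.put("key", "value");
            assertEquals("value", map.get("key"));
            assertEquals(1, map.size());
        }
    }

Loaded from the regular classpath, the class loader throws a SecurityException
("Prohibited package name: java.util") - which is exactly the classloader tangle
mentioned above, so you end up on the bootclasspath one way or another.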

> I'm very open to the idea that I'm missing something here, but I'd like 
> to know that you see the issue - that when we test, we have
> 
>   VM + "classlib to be tested" + JUnit + testcases
> 
> where the testcases are testing the classlib the VM is running JUnit with.
> 
> There never is isolation of the code being tested :
> 
>   VM + "known good classlib" + Junit + testcases
> 
> unless we have some framework where
> 
>   VM + "known good classlib" + JUnit
>       + framework("classlib to be tested")
>            + testcases
> 
> and it's that notion of "framework()" that I'm advocating we explore.

I'm all for exploring it, I just fundamentally don't buy into the "known
good" bit. What happens when the 'classlib to be tested' is 'known
better' than the 'known good' one? How do you define "known"? How do you
define "good"?

> >Further ideas...
> >
> >-> look at how the native world does testing
> >   (hint: it usually has #ifdefs, uses perl along the way, and it is certainly "messy")
> >   -> emulate that
> >
> >-> build a bigger, better specification test
> >   -> and somehow "prove" it is "good enough"
> >
> >-> build a bigger, better integration test
> >   -> and somehow "prove" it is "good enough"
> >
> >I'll admit my primary interest is the last one...
> 
> The problem I see with the last one is that the "parameter space" is *huge*.

Yeah, that's one of the things that makes it interesting. Fortunately
open source does have many monkeys...

> I believe that your preference for the last one comes from the 
> Monte-Carlo style approach that Gump uses - hope that your test suite 
> has enough variance that you "push" the thing being tested through 
> enough of the parameter space that you can be comfortable you would have 
> exposed the bugs.  Maybe.

Ooh, now it's becoming rather abstract...

Well, perhaps, but more of the gump approach comes from the idea that
the parameter space itself is also at some point defined in software,
which may have bugs of its own. You circumvent that by making humans the
parameter space (don't start about how humans are buggy. We don't want to
get into existentialism or faith systems when talking about unit testing, do
we?). The thing that gump enables is "many monkey QA" - a way for thousands
of human beings to concurrently make shared assertions about software
without actually needing all that much human interaction.

More concretely, if harmony can run all known java software, and run it to
the asserted satisfaction of all its developers, you can trust that you have
covered all the /relevant/ parts of the parameter space you describe. You
will never get that level of trust when the assertions are made by software
rather than humans. This is how open source leads to software quality.

Quoting myself, 'gump is the most misunderstood piece of software, ever'.

cheers,

Leo
