harmony-dev mailing list archives

From George Harley <george.c.har...@googlemail.com>
Subject Re: [classlib] Testing
Date Thu, 23 Mar 2006 20:56:42 GMT
Geir Magnusson Jr wrote:
> Leo Simons wrote:
>> On Wed, Mar 22, 2006 at 08:02:44AM -0500, Geir Magnusson Jr wrote:
>>> Leo Simons wrote:
>>>> On Wed, Mar 22, 2006 at 07:15:28AM -0500, Geir Magnusson Jr wrote:
>>>>> Pulling out of the various threads where we have been discussing, 
>>>>> can we agree on the problem :
>>>>> We have unique problems compared to other Java projects because we 
>>>>> need to find a way to reliably test the things that are commonly 
>>>>> expected to be a solid point of reference - namely the core class 
>>>>> library.
>>>>> Further, we've been implicitly doing "integration testing" because 
>>>>> - so far - the only way we've been testing our code has been 'in 
>>>>> situ' in the VM - not in an isolated test harness.  To me, this 
>>>>> turns it into an integration test.
>>>>> Sure, we're using JUnit, but because we are implementing core 
>>>>> java.* APIs, we aren't testing with a framework that has been 
>>>>> independently tested for correctness, like we would when testing 
>>>>> any other code.
>>>>> I hope I got that idea across - I believe that we have to go 
>>>>> beyond normal testing approaches because we don't have a normal 
>>>>> situation.
>>>> Where we define 'normal situation' as "running a test framework on
>>>> top of the sun jdk and expecting any bugs to not be in that jdk".
>>>> There's plenty of projects out there that have to test things without
>>>> having such a "stable reference JDK" luxury... I imagine that testing
>>>> GCC is just as hard as this problem we have here :-)
>>> Is it the same?  We need to have a running JVM+classlibrary to test 
>>> the classlibrary code.
>> Well you need a working C compiler and standard C library to compile the
>> compiler so you can compile make so you can build bash so you can run
>> perl (which uses the standard C library functions all over the place of
>> course) so you can run the standard C library tests so that you know
>> that the library you used when compiling the compiler was correct so you
>> can run the compiler tests. I don't think they actually do things that
>> way, but it seems like basically the same problem. Having a virtual
>> machine just makes it easier since you still assume "the native world"
>> as a baseline, which is a lot more than "the hardware".
> There's a difference.  You can use a completely separate toolchain to 
> build, test and verify the output of the C compiler.
> In our case, we are using the thing we are testing to test itself. 
> There is no "known good" element possible right now.
> We use the classlibrary we are trying to test to execute the test 
> framework that tests the classlibrary that is running it.
> The tool is testing itself.
>>>>> So I think there are three things we want to do (adopting the 
>>>>> terminology that came from the discussion with Tim and Leo ) :
>>>>> 1) implementation tests
>>>>> 2) spec/API tests (I'll bundle together)
>>>>> 3) integration/functional tests
>>>>> I believe that for #1, the issues related to being on the 
>>>>> bootclasspath don't matter, because we aren't testing that aspect 
>>>>> of the classes (which is how they behave integrated w/ the VM and 
>>>>> security system) but rather the basic internal functioning.
>>>>> I'm not sure how to approach this, but I'll try.  I'd love to hear 
>>>>> how Sun, IBM or BEA deals with this, or be told why it isn't an 
>>>>> issue :)
>>>>> Implementation tests : I'd like to see us be able to do #1 via the 
>>>>> standard same-package technique (i.e. testing a.b.C w/ a.b.CTest) 
>>>>> but we'll run into a tangle of classloader problems, I suspect, 
>>>>> because we want to be testing java.* code in a system that already 
>>>>> has java.* code. Can anyone see a way we can do this - test the 
>>>>> classlibrary from the integration point of view - using some test 
>>>>> harness + any known-good JRE, like Sun's or IBM's?
>>>> Ew, that won't work in the end since we should assume our own JRE
>>>> is going to be "known-better" :-). But it might be a nice way to
>>>> "bootstrap" (eg we test with an external JRE until we satisfy the
>>>> tests and then we switch to testing with an earlier build).
>>> Let's be clear - even using our own "earlier build" doesn't solve the 
>>> problem I'm describing, because as it stands now, we don't use 
>>> "earlier build" classes to test with - we use the code we want to 
>>> test as the classlibrary for the JRE that's running the test framework.
>>> The classes that we are testing are also the classes used by the 
>>> testing framework.  IOW, any of the java.* classes that JUnit itself 
>>> needs (ex. java.util.HashMap) are exactly the same implementation 
>>> that it's testing.
>>> That's why I think it's subtly different than a "bootstrap and use 
>>> version - 1 to test" problem.  See what I mean?
>> Yeah yeah, I was already way beyond thinking "just" JUnit is usable
>> for the kind of test you're describing. At some point, fundamentally,
>> you either trust something external (whether it's the sun jdk or the
>> intel compiler designers, at some point you do draw a line) or you
>> find a way to bootstrap.
> Well, we do trust the Sun JDK.
>>> I'm very open to the idea that I'm missing something here, but I'd 
>>> like to know that you see the issue - that when we test, we have
>>>   VM + "classlib to be tested" + JUnit + testcases
>>> where the testcases are testing the classlib the VM is running JUnit 
>>> with.
>>> There never is isolation of the code being tested :
>>>   VM + "known good classlib" + JUnit + testcases
>>> unless we have some framework where
>>>   VM + "known good classlib" + JUnit
>>>       + framework("classlib to be tested")
>>>            + testcases
>>> and it's that notion of "framework()" that I'm advocating we explore.
>> I'm all for exploring it, I just fundamentally don't buy into the "known
>> good" bit. What happens when the 'classlib to be tested' is 'known
>> better' than the 'known good' one? How do you define "known"? How do you
>> define "good"?
> Known?  Passed some set of tests. So it could be the Sun JDK for the 
> VM + "known good" part.
> I think you intuitively understand this.  When you find a bug in code 
> you are testing, you first assume it's your code, not the framework, 
> right?  In our case, our framework is actually the code we are 
> testing, so we have a bit of a logical conundrum.

Hi Geir,

The number of Harmony public API classes that get loaded just to run the 
JUnit harness is a little over 200. The majority of these are out of 
LUNI with a very low number coming from each of Security, NIO, Archive 
and Text.
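
A minimal sketch of how a per-module tally like this could be made,
assuming the names of the loaded classes have already been captured by
whatever means the VM in use offers. The class name LoadedClassTally and
the package-to-module table are hypothetical and only roughly
approximate the module split; this is not the tool that produced the
figures above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class LoadedClassTally {
    // Hypothetical package-prefix to module table (an assumption for
    // illustration). More specific prefixes come first so that
    // java.util.zip is matched before the broader java.util entry.
    private static final String[][] PREFIX_TO_MODULE = {
        {"java.util.zip.", "Archive"},
        {"java.util.jar.", "Archive"},
        {"java.security.", "Security"},
        {"java.nio.", "NIO"},
        {"java.text.", "Text"},
        {"java.lang.", "LUNI"},
        {"java.util.", "LUNI"},
        {"java.net.", "LUNI"},
        {"java.io.", "LUNI"},
    };

    // Returns the module of the first matching prefix, or "other".
    static String moduleOf(String className) {
        for (String[] entry : PREFIX_TO_MODULE) {
            if (className.startsWith(entry[0])) {
                return entry[1];
            }
        }
        return "other";
    }

    // Counts how many of the given class names fall into each module.
    static Map<String, Integer> tally(List<String> classNames) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String name : classNames) {
            counts.merge(moduleOf(name), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample input, not real measurement data.
        List<String> sample = Arrays.asList(
            "java.lang.String",
            "java.util.HashMap",
            "java.io.File",
            "java.util.zip.ZipFile",
            "java.nio.ByteBuffer");
        System.out.println(tally(sample));
    }
}
```

Listing the more specific prefixes first keeps the lookup a plain linear
scan while still letting java.util.zip win over java.util.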

Sure there is a circular dependency between what we are building and the 
framework we are using to test it but it appears to touch on only a 
relatively small part of Harmony....IMHO.
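
To make the circularity concrete, here is a small sketch (the class name
SelfHostingCheck is made up) that asks where a class came from. It
assumes a VM that reports the bootstrap loader as null from
Class.getClassLoader(), which the API permits but does not require.

```java
import java.util.HashMap;

class SelfHostingCheck {
    // True when the class was defined by the bootstrap loader, on VMs
    // that represent that loader as null (an assumption, see above).
    static boolean fromBootstrap(Class<?> c) {
        return c.getClassLoader() == null;
    }

    public static void main(String[] args) {
        // A map such as a test harness might allocate internally...
        Object harnessMap = new HashMap<String, String>();

        // ...is an instance of the very class a HashMap test exercises.
        System.out.println("same class: "
            + (harnessMap.getClass() == HashMap.class));

        // The class library comes from the bootstrap loader, while the
        // code on the classpath (harness, testcases) does not.
        System.out.println("HashMap from bootstrap: "
            + fromBootstrap(HashMap.class));
        System.out.println("this class from bootstrap: "
            + fromBootstrap(SelfHostingCheck.class));
    }
}
```

Whatever classes the harness pulls in this way are the ones whose bugs
could mask or fake a test result.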

Best regards,

>>>> Further ideas...
>>>> -> look at how the native world does testing
>>>>   (hint: it usually has #ifdefs, uses perl along the way, and it is
>>>>    certainly "messy")
>>>>   -> emulate that
>>>> -> build a bigger, better specification test
>>>>   -> and somehow "prove" it is "good enough"
>>>> -> build a bigger, better integration test
>>>>   -> and somehow "prove" it is "good enough"
>>>> I'll admit my primary interest is the last one...
>>> The problem I see with the last one is that the "parameter space" is 
>>> *huge*.
>> Yeah, that's one of the things that makes it interesting. Fortunately
>> open source does have many monkeys...
>>> I believe that your preference for the last one comes from the 
>>> Monte-Carlo style approach that Gump uses - hope that your test 
>>> suite has enough variance that you "push" the thing being tested 
>>> through enough of the parameter space that you can be comfortable 
>>> you would have exposed the bugs.  Maybe.
>> Ooh, now it's becoming rather abstract...
>> Well, perhaps, but more of the gump approach comes from the idea that
>> the parameter space itself is also at some point defined in software,
>> which may have bugs of its own. You circumvent that by making humans the
>> parameter space (don't start about how humans are buggy. We don't
>> want to get into existentialism or faith systems when talking about
>> unit testing do we?). The thing that gump enables is "many monkey QA" -
>> a way for thousands of human beings to concurrently make shared
>> assertions about software without actually needing all that much human
>> interaction.
>> More concretely, if harmony can run all known java software, and run
>> it to the asserted satisfaction of all its developers, you can trust
>> that you have covered all the /relevant/ parts of the parameter space
>> you describe.
> Yes.  And when you can run all known Java software, let me know :) 
> That's my point about the parameter space being huge.  Even when you 
> reduce the definition to "that of all known Java software", you still 
> have a huge problem on your hands.
>> You will never get that level of trust when the assertions are made by
>> software rather than humans. This is how open source leads to software
>> quality.
>> Quoting myself, 'gump is the most misunderstood piece of software,
>> ever'.
>> cheers,
>> Leo
