lucene-dev mailing list archives

From "Dawid Weiss (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LUCENE-3492) Extract a generic framework for running randomized tests.
Date Tue, 11 Oct 2011 13:07:11 GMT

    [ https://issues.apache.org/jira/browse/LUCENE-3492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125001#comment-13125001 ]

Dawid Weiss commented on LUCENE-3492:
-------------------------------------

A word of warning: this will be a longer comment. I still hope somebody will read it ;)

I've written a somewhat largish chunk of code that provides an infrastructure to run "randomized",
but "repeatable" tests. I'd like to report on my impressions so far.

Robert was right that a custom runner provides more flexibility than a @Rule on top of the
default JUnit runner (which changes depending on where you run it -- Ant, Maven, Eclipse, etc.).
I've spent a lot of time inspecting the current implementation inside JUnit and I came to
the conclusion that it really is best to have a full reimplementation of the Runner interface.
"Full" meaning not extending ParentRunner, but implementing the whole runner from scratch.
This provides additional, uhm, unexpected benefits in that one can add new functionality that
"regular" JUnit runners don't have and _still_ be compatible with hosting environments such
as Ant, Maven or Eclipse (because they, thank God, respect @RunWith). 

Among the things I have implemented so far that are missing or different in JUnit are:
- There is a "context" object which is accessible via thread local, so @BeforeClass and other
suite-level hooks can actually access the suite class, inspect it, check conditions, whatever
(the runner's random seed is also passed via this context). This is useful, but not crucial.
- I've decided to deviate from JUnit's strict policy of requiring public hook methods. That policy
mostly causes headaches when one shadows or overrides a hook from the parent class and it
is no longer invoked. A better (different) idea is to declare hooks as private; no shadowing
occurs and they all get invoked in a contractual, predefined order (befores: super to
class, afters: class to super).
- I've added additional suite-level annotations. @Listeners provides listeners automatically
hooked to RunListener. @Validators hooks up additional validators for verifying extra restrictions.
An example of such a restriction is bailing out the test suite if shadowed or overridden methods
exist in the class hierarchy of a suite class. Another (that I have implemented) is a validator
checking for non-annotated testXXX methods that are dead JUnit3 test cases. You get the idea.
A lot of code then simply vanishes from LTC; I can envision it having this shape:
{code}
@Listeners({
  StandardErrorInfoRunListener.class})
@Validators({
  NoHookMethodShadowing.class,
  NoTestMethodOverrides.class,
  NoJUnit3TestMethods.class})
public abstract class LuceneTestCase extends RandomizedTest {
  ...
}
{code}
Some of these things are currently verified using a state machine (calling super() in overridden
methods), but it just looks better to me to move this concern elsewhere rather than
implement it inside LTC.
- The entire lifecycle of handling test method calls and hooks is controlled in the runner.
I made a design decision to _not_ follow JUnit's insane wrap-wrap-wrap-exception style but
instead report all exceptions that happen anywhere in the lifecycle. So if you get an exception
in the test case, followed by an exception in @After, followed by an exception in @AfterClass,
all these exceptions will be reported separately to the RunListener and in effect to all listening
objects (in the lifecycle-corresponding order!). Such an implementation works fine
with Ant JUnit reports, Maven reports and in Eclipse (all exceptions are included) as far
as I can tell -- I didn't check other environments like NetBeans or IntelliJ. Again: in my personal
opinion this is a much clearer way of dealing with exceptions in the lifecycle of a JUnit test
case compared to wrapping them into artificial exceptions (MultipleFailureException being a supreme
example) or suppressing them altogether.
- I couldn't resist a tiny tweak of making any exceptions thrown from hooks or test methods
carry the information about the seed used in their execution (both runner-level and method-level,
even though the latter could be derived from the former). There is no easy way to do it because
Throwables are designed not to allow changes to their content once constructed. With the
exception of stack traces :) So I simply inject the debugging info into the stack trace as
an artificial entry; here is what it looks like, for instance:
{noformat}
java.lang.Error: Blah blah exception message.
	at __randomizedtesting.SeedInfo.seed([60BDF6E574486C2:60BDF6E76C930BC]:0)
	at […].examples.TestStackAugmentation$Nested.testMethod1(TestStackAugmentation.java:29)
{noformat}
(Note how the seed info sits in the file-position slot of the StackTraceElement object.) This may
seem like an overly clever solution, but I've had it many times that sysouts got discarded or
lost somehow, whereas an exception object along with its stack trace is always there in front of
your eyes. Another way to capture-and-dump reproduction info is to use the @Listeners annotation
above; this can be used for much of what LTC does today -- -D…, -D…, -D...
- A custom runner can have custom implementation of the contractual "events", such as assumptions
or ignore triggers. This takes away a lot of code related to trying to get around JUnit's
API limitations (assume without message/cause, method filtering and dynamic ignores based
on extra conditions like @Nightly, etc.).
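
The stack-trace trick from the list above can be sketched roughly as follows. This is only an illustration, not the code from the runner: the class name SeedStackAugmenter and the method augment are made up here, while the marker frame name (__randomizedtesting.SeedInfo.seed) and the [runner:method] seed layout are copied from the trace shown earlier. It relies only on the standard Throwable.setStackTrace and the four-argument StackTraceElement constructor.

```java
import java.util.Locale;

/** Hypothetical helper (name invented for this sketch); not the actual runner code. */
final class SeedStackAugmenter {
  private SeedStackAugmenter() {}

  /**
   * Prepends a synthetic stack frame whose "file name" slot carries both seeds,
   * so the reproduction info travels with the exception itself.
   */
  static <T extends Throwable> T augment(T t, long runnerSeed, long methodSeed) {
    String seeds = "["
        + Long.toHexString(runnerSeed).toUpperCase(Locale.ROOT)
        + ":"
        + Long.toHexString(methodSeed).toUpperCase(Locale.ROOT)
        + "]";

    // Arguments: declaring class, method name, file name, line number.
    StackTraceElement marker =
        new StackTraceElement("__randomizedtesting.SeedInfo", "seed", seeds, 0);

    StackTraceElement[] original = t.getStackTrace();
    StackTraceElement[] augmented = new StackTraceElement[original.length + 1];
    augmented[0] = marker;
    System.arraycopy(original, 0, augmented, 1, original.length);

    // The stack trace is the one part of a Throwable that can be replaced after construction.
    t.setStackTrace(augmented);
    return t;
  }

  public static void main(String[] args) {
    Error e = augment(new Error("Blah blah exception message."),
        0x60BDF6E574486C2L, 0x60BDF6E76C930BCL);
    e.printStackTrace();
  }
}
```

A runner would call something like augment(...) on every Throwable it catches from a hook or a test method, before reporting it to the RunListener.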

In short: I'm really happy with a custom Runner.

As for the infrastructure for writing randomized test cases:
- There is currently one "master" seed that the runner either generates randomly or accepts
as a global constant. Everything else: method shuffling, initial random instance for each
test case (method repetition)… really everything is based on sequential calls to this generator.
This has advantages and disadvantages I guess (read about static initializers below), but
it was my personal desire to implement it this way and based on my few days' worth of experience
with this code, it works great.
- I've written a base class RandomizedTest that extends Assert and has a number of utility
methods for picking random numbers or objects from collections. There is no passing of explicit
_Random_ instances around like it is done currently in LTC though. The base class accesses
the context's Random (which is assigned by the runner) and then uses this random consistently
to generate pseudo-randomness in selection of attributes and iterations. Of course once you
go multi-threaded this will all go to dust, but I imagine multi-threaded tests shouldn't use
the base class's randomness (a test case based on race conditions won't be repeatable anyway).
If anything, generate per-thread Randoms based on current seed and let each thread handle
its own sequence of pseudo-random numbers from there. This is even possible at runtime with
non-mock objects as I'm going to show in Barcelona, hopefully.
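
A minimal sketch of the per-thread idea, under assumptions of my own: the class DerivedRandoms and its methods are hypothetical and are not part of RandomizedTest. The master generator is consulted sequentially on one thread only, and each worker then owns a private java.util.Random built from its derived seed, so the sequence seen by each thread does not depend on scheduling.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical illustration (names invented here); not part of the actual framework. */
final class DerivedRandoms {
  private DerivedRandoms() {}

  /** Derives n seeds by sequential calls to a generator seeded with the master seed. */
  static long[] deriveSeeds(long masterSeed, int n) {
    Random master = new Random(masterSeed);
    long[] seeds = new long[n];
    for (int i = 0; i < n; i++) {
      seeds[i] = master.nextLong();
    }
    return seeds;
  }

  /** Starts one worker per derived seed and returns the first value each worker drew. */
  static int[] runWorkers(long masterSeed, int threads) {
    long[] seeds = deriveSeeds(masterSeed, threads);
    int[] firstDraw = new int[threads];
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      final int idx = i;
      workers[i] = new Thread(() -> {
        Random local = new Random(seeds[idx]); // private to this thread, never shared
        firstDraw[idx] = local.nextInt(1000);
      });
      workers[i].start();
    }
    for (Thread w : workers) {
      try {
        w.join(); // join also makes the worker's write to firstDraw visible here
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for workers", e);
      }
    }
    return firstDraw;
  }

  public static void main(String[] args) {
    // Two runs with the same master seed produce the same per-thread values.
    System.out.println(Arrays.equals(runWorkers(42L, 4), runWorkers(42L, 4)));
  }
}
```

This only makes each thread's random sequence repeatable; anything that depends on thread interleaving stays non-repeatable, as noted above.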


Now… if you're still with me you're probably interested in how this applies to Lucene. The
wall I've hit is the sheer amount of code that any change to LTC affects. I realized it'd
be large, but it's just gargantuan :) 

The major issue is with static initializers and static public methods called from them that
leave resources behind. I'm sorry, but nobody can convince me this isn't evil. I understand
certain things are costly and require a one-time setup, but these should really be moved to
@BeforeClass fixture hooks. If one really needs to do things once at JVM lifespan level, a
@BeforeClass with some logic to perform a single initialization can be a replacement for a
static initializer (even if it's unclear to me when exactly such a fixture would be really
needed). In short: the problem with static initializers is that they are executed outside
the lifecycle control of the runner… I'd say most of the problems and current patchy solutions
inside LTC (dealing with resource tracking for example) are somehow related to the fact that
static initializers and static method calls are used throughout the codebase. 
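
As a rough sketch of the "@BeforeClass with some logic to perform a single initialization" idea: the class OneTimeSetup, its method and its counter are all hypothetical names, and the JUnit annotation appears only in a comment so that the snippet compiles without JUnit on the classpath.

```java
/** Hypothetical once-per-JVM setup guard (names invented for this sketch). */
final class OneTimeSetup {
  private static boolean initialized;
  private static int setupCount; // only here to make the behavior observable

  private OneTimeSetup() {}

  /**
   * Meant to be called from a suite-level hook, for example:
   *
   *   @BeforeClass
   *   private static void initOnce() { OneTimeSetup.ensureInitialized(); }
   *
   * Unlike a static initializer, the costly work then runs inside the
   * lifecycle that the runner controls.
   */
  static synchronized void ensureInitialized() {
    if (!initialized) {
      setupCount++; // stands in for the costly one-time setup
      initialized = true;
    }
  }

  static synchronized int setupCount() {
    return setupCount;
  }

  public static void main(String[] args) {
    ensureInitialized(); // first suite
    ensureInitialized(); // second suite, no repeated work
    System.out.println(setupCount());
  }
}
```

Whether the hook may be private depends on the runner; the stock JUnit 4 runner requires public static @BeforeClass methods.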

I am currently wondering if it's feasible to provide a single patch that will make a drop-in
replacement of LTC. It may be the case that adding another skeleton class based on the "new"
infrastructure and rewriting tests one by one to use it is a more sensible
way to go.

The runner (alone) is currently on GitHub if you care to take a look. I think Barcelona may
be a good place to talk about this face to face and decide what to do with it. I'm myself
leaning towards having parallel base classes and porting existing tests in chunks.

                
> Extract a generic framework for running randomized tests.
> ---------------------------------------------------------
>
>                 Key: LUCENE-3492
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3492
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: general/test
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: Screen Shot 2011-10-06 at 12.58.02 PM.png
>
>
> I love the idea of randomized testing. Everyone (we at CarrotSearch, Lucene and Solr
> folks) has their own glue to make it possible. The question is if there's something to pull out
> that others could share without having the need to import Lucene-specific classes.
> The work on this issue is on my github account (lots of experiments):
> https://github.com/dweiss/randomizedtesting
> Or directly: git clone git://github.com/dweiss/randomizedtesting.git


