cocoon-dev mailing list archives

From Sylvain Wallez
Subject Re: Weird multithreading bug in Cron block
Date Wed, 08 Jun 2005 21:46:58 GMT
Sylvain Wallez wrote:

> Hi all,
> I'm currently working on a publication application with complex 
> database queries where we want to prefetch some of the pages linked to 
> by the page currently being produced, in order to speed up response 
> time on pages that are likely to be asked for by users.
> To achieve this, we have a "PrefetchTransformer" that grabs elements 
> having a prefetch="true" attribute and starts a background job to load 
> the corresponding "src" or "href" URL using a "cocoon:".
> At first I used JobScheduler.fireJob() to schedule for immediate 
> execution, but ran into *weird* bugs with strange NPEs all around in 
> pipeline components. After analysis, it appeared that while the 
> scheduler thread was processing the pipeline, the http thread was 
> recycling the *background environment*, thus nulling the object model 
> and other class attributes used by pipeline components.
> I spent the *whole day* trying to find the cause for this, without 
> success (how frustrating).
> Then I decided to try another approach and use 
> JobScheduler.fireJobAt(new Date()), meaning "schedule the job for 
> later execution... now!". And it worked!
> Weird, weird, weird! Anybody having a hint about why fireJob() is 
> doing this environment mixture?

Actually, fireJobAt() is also broken with another test case. 
Desperately searching for the cause, I went back to basics, i.e. "new 
Thread(runnable).start()". That was broken too, but it helped me finally 
find the cause :-)

The problem lies in CocoonComponentManager.addForAutomaticRelease().

The environmentStack is a CloningInheritableThreadLocal. That means that 
when we create a new thread, it inherits the environment stack of its 
parent thread.
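
A minimal sketch of that inheritance, using the JDK's InheritableThreadLocal 
as a stand-in. It assumes CloningInheritableThreadLocal behaves like an 
InheritableThreadLocal whose childValue() copies the parent's value; the 
class name, the String entries and the helper method are made up for 
illustration, and the code uses modern Java syntax:

```java
import java.util.ArrayList;
import java.util.List;

public class InheritedStackDemo {
    // Hypothetical stand-in for the environment stack: a child thread
    // receives a copy of the parent's list when the Thread is constructed.
    static final InheritableThreadLocal<List<String>> STACK =
        new InheritableThreadLocal<List<String>>() {
            @Override
            protected List<String> initialValue() {
                return new ArrayList<>();
            }

            @Override
            protected List<String> childValue(List<String> parentValue) {
                // New list, same elements (a shallow copy).
                return new ArrayList<>(parentValue);
            }
        };

    /** Returns the stack contents as seen by a freshly created child thread. */
    static List<String> stackSeenByChild() {
        STACK.get().add("env-of-http-request");
        List<String> seen = new ArrayList<>();
        Thread child = new Thread(() -> seen.addAll(STACK.get()));
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            STACK.remove(); // leave the calling thread clean
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(stackSeenByChild());
    }
}
```

The child never pushed anything itself, yet its stack is not empty: it 
already holds the entry of the thread that created it.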

The result is that threads created by Cocoon *always* inherit an 
environment stack of at least size 1:
- in the cron block, that's the environment of the first http request, 
which created the Cocoon object
- for "new Thread()", that's the same as above, plus all sitemaps that 
we've been through when we create the thread.

Now let's look at InvokeContext.getProcessingPipeline() (in 
treeprocessor): if this is an internal request, the pipeline object is 
added for automatic release. I guess this is to avoid memory leaks if 
ever we forget to call resolver.release() on a sitemap source.

Following this path, let's go to 
CocoonComponentManager.addForAutomaticRelease(). The component that has 
to be autoreleased is added to a list attached to the *first* 
environment of the stack (because of "stack.get(0)"), and is therefore 
released when we exit this environment.

Now what happens when we create a thread that runs in the background? 
The end of processing of the *http* request releases pipeline objects of 
all child threads of the servlet engine's thread (the one which 
processes the http request). If the background thread uses a "cocoon:" 
URL and is currently executing the corresponding pipeline, recycle() is 
called on all pipeline components and bang, NPEs all around the place!!
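
The sequence can be modelled in a few lines. This is a single-threaded 
simulation rather than Cocoon code: Env, Pipeline and the method bodies are 
hypothetical, and it assumes that the inherited stack is a copy of the list 
that still points at the same environment objects, which is what the 
behaviour described above suggests:

```java
import java.util.ArrayList;
import java.util.List;

public class AutoReleaseDemo {
    /** Hypothetical pipeline component whose state is nulled by recycle(). */
    static class Pipeline {
        Object objectModel = new Object();

        void recycle() {
            objectModel = null;
        }
    }

    /** Hypothetical environment that releases registered components on exit. */
    static class Env {
        final List<Pipeline> autoRelease = new ArrayList<>();

        void leave() {
            for (Pipeline p : autoRelease) {
                p.recycle();
            }
            autoRelease.clear();
        }
    }

    /** Models the "stack.get(0)" registration: always the first environment. */
    static void addForAutomaticRelease(List<Env> stack, Pipeline p) {
        stack.get(0).autoRelease.add(p);
    }

    /** Returns true if the HTTP side recycled the background job's pipeline. */
    static boolean backgroundPipelineRecycled() {
        Env httpEnv = new Env();
        List<Env> httpStack = new ArrayList<>();
        httpStack.add(httpEnv);

        // The background thread inherits a copy of the list (same Env
        // objects), then enters its own environment on top of it.
        List<Env> backgroundStack = new ArrayList<>(httpStack);
        backgroundStack.add(new Env());

        // An internal "cocoon:" request registers its pipeline for release.
        Pipeline backgroundPipeline = new Pipeline();
        addForAutomaticRelease(backgroundStack, backgroundPipeline);

        // The HTTP request ends while the background job is still running.
        httpEnv.leave();

        return backgroundPipeline.objectModel == null;
    }

    public static void main(String[] args) {
        System.out.println(backgroundPipelineRecycled());
    }
}
```

In the real case the two sides run on different threads, so whether the 
background pipeline is hit depends on timing; the simulation only fixes the 
unlucky ordering.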

And this leads to very random bugs: since servlet engines use thread 
pools, this erroneous pipeline release happens only when the servlet 
engine reuses the thread that initially loaded CocoonServlet. And NPEs 
happen when this first thread is used *and* a scheduler thread is 
executing a "cocoon:" pipeline. Weird...

So the question is:
- why does the environment stack have to be inherited by child threads? 
Is it to keep the current context? Then isn't inheriting the current 
processor and uri context enough?
- why is the pipeline automatically released? Is it to avoid memory leaks?

Possible remedies would be to remove one of the above features, but I 
guess they're there for a reason.

So what about adding CocoonComponentManager.clearEnvironmentStack(), 
that could be called by CocoonQuartzJobExecutor, or even 
DefaultThreadFactory (in o.a.c.c.thread) before running the job?
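
A rough sketch of what that could look like, again with a plain 
InheritableThreadLocal standing in for the real environment stack. The 
clearEnvironmentStack() body, the withCleanStack() wrapper and the demo 
method are assumptions for illustration, not existing Cocoon API:

```java
import java.util.ArrayList;
import java.util.List;

public class ClearStackDemo {
    // Hypothetical stand-in for the inherited environment stack.
    static final InheritableThreadLocal<List<String>> STACK =
        new InheritableThreadLocal<List<String>>() {
            @Override
            protected List<String> initialValue() {
                return new ArrayList<>();
            }

            @Override
            protected List<String> childValue(List<String> parentValue) {
                return new ArrayList<>(parentValue);
            }
        };

    /** Proposed remedy: give the current thread a fresh, empty stack. */
    static void clearEnvironmentStack() {
        STACK.set(new ArrayList<>());
    }

    /** What a job executor or thread factory could wrap each job in. */
    static Runnable withCleanStack(Runnable job) {
        return () -> {
            clearEnvironmentStack();
            job.run();
        };
    }

    /** Returns the stack size a background job sees when it starts. */
    static int stackSizeSeenByJob(boolean clean) {
        STACK.get().add("env-of-http-request");
        int[] size = new int[1];
        Runnable job = () -> size[0] = STACK.get().size();
        Thread worker = new Thread(clean ? withCleanStack(job) : job);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            STACK.remove(); // leave the calling thread clean
        }
        return size[0];
    }

    public static void main(String[] args) {
        System.out.println(stackSizeSeenByJob(false)); // inherited entry visible
        System.out.println(stackSizeSeenByJob(true));  // stack cleared first
    }
}
```

With the wrapper, the job starts from an empty stack, so whatever it 
registers for automatic release can only attach to environments it entered 
itself.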



Sylvain Wallez                        Anyware Technologies  
Apache Software Foundation Member     Research & Technology Director
