cocoon-dev mailing list archives

From Sylvain Wallez
Subject [HEADS-UP] writing components for cache efficiency
Date Tue, 13 May 2003 15:23:38 GMT
Hi folks,

Lately, I've been looking at the operations happening in the 
TraxTransformer because of very poor performance. First, on this 
particular component, I found that "helloworld.html" can be served 
10 to 50 times faster (a speed increase of 1000% to 5000%, depending 
on server load) by changing 2 lines in cocoon.xconf :
- set "use-store" parameter to "true" on XSLT processors,
- set "use-persistent-cache" parameter to "false" on cache-transient.

I remember some discussions about Xalan memory leaks that led to setting 
"use-store" to false, but this doesn't seem to happen with XSLTC (and 
maybe with the newest Xalan ?)

So far so good. Now looking at TraxTransformer.setup(), we can see that 
it gets the validity _and_ a TransformerHandler at the same time. 
And if the validity is still valid, the TransformerHandler is simply not 
used since the content is retrieved from the cache. So I hacked a bit to 
separate these, and obtained a further speed increase ranging from 5% 
to 30%.

This leads to an important recommendation for achieving maximum cache 
efficiency : *the setup() method must avoid performing operations that 
are necessary only if the content is not cached*. Otherwise, it's just 
time wasted when delivering cached content.
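
To illustrate the recommendation, here is a minimal sketch of a 
component that only records its parameters in setup() and creates the 
costly object later. It is a simplified model, not the actual 
TraxTransformer code : LazyTransformer, ExpensiveHandler, connect() and 
handlersCreated are hypothetical names, and connect() stands in for the 
moment the pipeline is really wired up to produce content.

```java
/** Hypothetical stand-in for a costly resource such as a TransformerHandler. */
final class ExpensiveHandler {
    final String stylesheet;
    ExpensiveHandler(String stylesheet) { this.stylesheet = stylesheet; }
}

/** Simplified model of a cacheable transformer; not the real Cocoon interfaces. */
final class LazyTransformer {
    private String stylesheet;
    private ExpensiveHandler handler;   // stays null until content must be produced
    int handlersCreated = 0;            // counts how often the creation cost is paid

    /** Runs on every request, cached or not: only remember what is needed later. */
    void setup(String stylesheet) {
        this.stylesheet = stylesheet;
        this.handler = null;
    }

    /** Cheap: in this sketch the cache key is just the stylesheet URI. */
    String getKey() {
        return stylesheet;
    }

    /** Runs only when content is not served from the cache. */
    void connect() {
        if (handler == null) {
            handler = new ExpensiveHandler(stylesheet);
            handlersCreated++;
        }
    }
}

class LazySetupDemo {
    public static void main(String[] args) {
        LazyTransformer t = new LazyTransformer();
        t.setup("hello.xsl");
        System.out.println("handlers after setup(): " + t.handlersCreated);
        t.connect();
        System.out.println("handlers after connect(): " + t.handlersCreated);
    }
}
```

With this shape, a request answered from the cache never reaches 
connect(), so the handler is never built for it.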

Here's a reminder of the various steps that occur when handling a request :
1 - the sitemap is executed, meaning we create a pipeline object, and 
pipeline components : generator, transformers, and serializer.
2 - the setup() method of all pipeline components is called (except 
serializer which doesn't have one)
3 - the getKey() method of all pipeline components is called

Knowing the key, the pipeline can get the associated cache entry and its 
validity. If the cached validity is either invalid or needs a fresh 
validity object to be compared with, then :
4 - the getValidity() method of all pipeline components is called

The pipeline can then know if the cache entry is valid. If it's valid, 
it delivers the cached content. If it's invalid, then :
5 - the pipeline is connected. This means setXMLConsumer() is called on 
transformers and setOutputStream() is called on the serializer
6 - the generator's generate() method is called, and starts the SAX 
stream processing, resulting first in startDocument() being called on 
transformers and serializer.
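
The steps above can be sketched as a toy pipeline that records which 
steps run for each request. This is only an illustration under 
simplifying assumptions : the class and method names are made up, the 
validity is reduced to a last-modified timestamp that is always compared 
against the cached one, and the "content" is a plain string rather than 
a SAX stream.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the request steps; names and types are illustrative, not Cocoon's API. */
class CachingPipelineSketch {
    /** Cached response together with the validity (a timestamp here) it was built from. */
    static final class Entry {
        final long validity;
        final String content;
        Entry(long validity, String content) {
            this.validity = validity;
            this.content = content;
        }
    }

    final Map<String, Entry> cache = new HashMap<>();
    final List<String> trace = new ArrayList<>();   // records which steps ran

    /** Handles one request for uri, whose source was last modified at lastModified. */
    String process(String uri, long lastModified) {
        trace.add("setup");                  // step 2: runs on every request
        String key = uri;                    // step 3: compute the cache key
        trace.add("getKey");
        Entry entry = cache.get(key);
        if (entry != null) {
            trace.add("getValidity");        // step 4: fresh validity to compare with
            if (entry.validity == lastModified) {
                return entry.content;        // valid entry: steps 5 and 6 are skipped
            }
        }
        trace.add("connect");                // step 5: wire the components together
        trace.add("generate");               // step 6: actually produce the content
        String content = "<html>" + uri + "@" + lastModified + "</html>";
        cache.put(key, new Entry(lastModified, content));
        return content;
    }

    public static void main(String[] args) {
        CachingPipelineSketch p = new CachingPipelineSketch();
        p.process("helloworld.html", 1L);    // cache miss
        p.process("helloworld.html", 1L);    // cache hit
        System.out.println(p.trace);
    }
}
```

The second request stops after the validity check, which is why 
anything done in "setup" is paid for even when the cached content is 
delivered.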

What we can see above is that the creation or lookup of resources used 
to process the content must be deferred as much as possible to points 5 
or 6. Doing it earlier is only a waste of resources. And that's what 
TraxTransformer is doing : it creates a TransformerHandler at point 2.

We should be aware of that and inspect cacheable components for possible 
enhancements. This is key for an increased performance of our beloved 
Cocoon !


Sylvain Wallez                                  Anyware Technologies 
{ XML, Java, Cocoon, OpenSource }*{ Training, Consulting, Projects }
