cocoon-dev mailing list archives

From Giacomo Pati
Subject Re: [FYI] Profiling Cocoon...
Date Sun, 06 Oct 2002 20:14:39 GMT
On Sun, 6 Oct 2002, Stefano Mazzocchi wrote:

> Hello people,
> I'm currently at Giacomo's place and we spent a rainy afternoon
> profiling the latest Cocoon to see if there is something we could
> fix/improve/blah-blah.
> WARNING: this is *by no means* a scientific report. But we have tried to
> be as informative as possible for developers.
> We were running Tomcat 4.1.10 + Cocoon HEAD on Sun JDK 1.4.1-b21 on
> Linux, instrumented with Borland OptimizeIt 4.2.
> Here is what we discovered:
> 1) Regarding memory leaks, Cocoon seems absolutely clean (for cocoon, we
> mean org.apache.cocoon.* classes). Avalon seems to be clean as well.
> Good job everyone.
> 2) We noticed extremely heavy use of
> org.apache.avalon.excalibur.collections.BucketMap$Node. It is *by far*
> the most used class in the heap. More than Strings, byte[], char[] and
> int[]. Some 140000 instances of that class.
> The number of BucketMap nodes grows linearly with the number of
> different pages accessed (as they are fed into the cache), but even a
> cached resource creates some 44 new nodes, which are later garbage
> collected.
> 44 is nothing compared to 140000, but still something to investigate.
> So, discovery #1:
>     BucketMaps are used *a lot*. Be aware of this.
> 3) Catalina seems to account for 10% of the pipeline time. Having
> extensively profiled and carefully optimized a servlet engine (JServ) I
> can tell you that this is *WAY* too much. Catalina doesn't seem like the
> best choice to run a loaded servlet-based site (contact
> if you want to do something about it: he's working on Jerry, a
> super-light servlet engine based on native APR and targeted especially
> for Apache 2.0)
> 4) Java I/O takes somewhere from 20% to 35% of the entire request time
> (reading and writing from the socket). This could well be a problem with
> the instrumented JVM since I don't think JDK 1.4 is that slow on I/O
> (especially using the new NIO facilities internally)
> 5) most of the time is spent on:
>    a) XSLT processing (and we knew that)
>    b) DTD parsing (and that was a surprise for me!)
> Yeah, DTD parsing. No, not for validation, but for entity resolution. It
> seems that even if the parser is non-validating, the DTD is fully parsed
> anyway just to do entity evaluation.
> So, discovery #2:
>     Be careful about DTDs even if the parser is not validating.
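The DTD lookup can be observed with plain JAXP/SAX. The following is a minimal sketch, not Cocoon code: the class name DtdLookupDemo and the example.invalid URL are made up for illustration. It counts how often a non-validating parser asks for an external entity, and answers each request with an empty DTD so that nothing is fetched or parsed.

```java
import java.io.StringReader;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of Cocoon.
final class DtdLookupDemo {

    /** Parses xml without validation and returns how many external entities the parser requested. */
    static int countExternalLookups(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(false); // explicit, although this is already the default
            SAXParser parser = factory.newSAXParser();

            AtomicInteger lookups = new AtomicInteger();
            DefaultHandler handler = new DefaultHandler() {
                @Override
                public InputSource resolveEntity(String publicId, String systemId) {
                    lookups.incrementAndGet();
                    // Answer with an empty DTD instead of letting the parser load the real one.
                    return new InputSource(new StringReader(""));
                }
            };
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return lookups.get();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE page SYSTEM \"http://example.invalid/page.dtd\"><page/>";
        System.out.println("external lookups: " + countExternalLookups(xml));
    }
}
```

Short-circuiting the DTD like this is only safe when the documents do not rely on entities or default attribute values declared in it; otherwise a resolver that serves a local copy of the DTD is the more cautious choice.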
> Of course, when the cache kicks in and the cached document is read
> directly from the compiled SAX events, we have an incredible speed
> improvement (also because entities are already resolved and hardwired).
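To illustrate why replaying stored SAX events is cheap, here is a rough sketch of the idea. It is not Cocoon's actual cache format; SaxEventCache and its methods are hypothetical names. The document is parsed once, the events are recorded in a list, and later requests replay them without touching the parser or the DTD. The entity reference is already expanded in the recorded text.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: records element and text events only.
final class SaxEventCache extends DefaultHandler {

    /** One recorded event that can be replayed against any ContentHandler. */
    interface Event { void replay(ContentHandler target) throws SAXException; }

    private final List<Event> events = new ArrayList<>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        Attributes copy = new AttributesImpl(atts); // the parser may reuse its Attributes object
        events.add(t -> t.startElement(uri, local, qName, copy));
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        events.add(t -> t.endElement(uri, local, qName));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        char[] copy = Arrays.copyOfRange(ch, start, start + length); // the buffer is reused too
        events.add(t -> t.characters(copy, 0, copy.length));
    }

    /** Parses the document once and returns the recorded events. */
    static SaxEventCache record(String xml) {
        try {
            SaxEventCache cache = new SaxEventCache();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), cache);
            return cache;
        } catch (Exception ex) {
            throw new IllegalStateException("could not parse document", ex);
        }
    }

    /** Sends the recorded events to target; no parsing happens here. */
    void replay(ContentHandler target) {
        try {
            target.startDocument();
            for (Event event : events) {
                event.replay(target);
            }
            target.endDocument();
        } catch (SAXException ex) {
            throw new IllegalStateException("replay failed", ex);
        }
    }

    /** Replays into a tiny handler that writes tags and text back out as a string. */
    String describe() {
        StringBuilder out = new StringBuilder();
        replay(new DefaultHandler() {
            @Override public void startElement(String u, String l, String q, Attributes a) {
                out.append('<').append(q).append('>');
            }
            @Override public void endElement(String u, String l, String q) {
                out.append("</").append(q).append('>');
            }
            @Override public void characters(char[] ch, int start, int length) {
                out.append(ch, start, length);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE p [<!ENTITY who \"Cocoon\">]><p>Hello &who;</p>";
        System.out.println(record(xml).describe());
    }
}
```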
> 6) Xalan incremental seems to be a little slower than regular Xalan, but
> on multiprocessing machines this might not be the case [Xalan uses two
> threads for incremental processing]
> NOTE: Xalan doesn't pool threads when it does that!
> So, while perceived performance is better for Xalan in incremental mode,
> the overall load of the machine is reduced if Xalan is used normally.
> 7) XSLTC *IS* blazingly fast compared to Xalan and is much less resource
> intensive.
> Discovery #3:
>   use XSLTC as much as possible!
> NOTE: our current root sitemap.xmap indicates that XSLTC is the default
> XSLT engine for Cocoon 2.1, but the fact is that the XSLTC factory is
> commented out, resulting in running Xalan. We should either correct that
> statement or uncomment the XSLTC factory.
> I vote for making XSLTC default even if this generates a few bug reports.
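Outside the sitemap, the same idea can be sketched with the standard TrAX API: compile the stylesheet once into a Templates object and reuse it for every request. The class CompiledStylesheet below is a hypothetical illustration, not Cocoon code. It uses whatever TransformerFactory the JVM is configured with. With Apache Xalan on the classpath, XSLTC can be selected by setting the system property javax.xml.transform.TransformerFactory to org.apache.xalan.xsltc.trax.TransformerFactoryImpl; that is an assumption about the deployment, not something taken from the sitemap.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of Cocoon.
final class CompiledStylesheet {

    private final Templates templates; // thread-safe, compiled form of the stylesheet

    CompiledStylesheet(String xslt) {
        try {
            // Picks up the factory configured for this JVM (system property or default).
            TransformerFactory factory = TransformerFactory.newInstance();
            this.templates = factory.newTemplates(new StreamSource(new StringReader(xslt)));
        } catch (TransformerException e) {
            throw new IllegalStateException("stylesheet did not compile", e);
        }
    }

    /** Applies the precompiled stylesheet; a new Transformer per call is cheap. */
    String apply(String xml) {
        try {
            StringWriter out = new StringWriter();
            templates.newTransformer().transform(
                    new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xslt =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"text\"/>"
          + "<xsl:template match=\"/\"><xsl:value-of select=\"/a\"/></xsl:template>"
          + "</xsl:stylesheet>";
        System.out.println(new CompiledStylesheet(xslt).apply("<a>hi</a>"));
    }
}
```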



> 8) Cocoon's hotspot is.... drum roll.... URI matching.
> TreeProcessor is complex and adds lots of complexity to the call stacks,
> but it seems to be very lightweight. It's URI matching that is the thing
> that needs more work performance-wise.
> Don't get me wrong, my numbers indicate that URI matching takes from 3%
> to 8% of response time. Compared to the rest it is nothing, but since
> this is the only thing we are in total control of, this is where we
> should concentrate profiling efforts.
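One common way to keep wildcard matching cheap is to compile each pattern once and reuse the compiled form on every request. The sketch below illustrates that technique with java.util.regex; it is not how Cocoon's matchers are implemented, and WildcardUriMatcher is a hypothetical name. It assumes that "*" matches within one path segment and "**" matches across segments.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.regex.Pattern;

// Hypothetical sketch, not Cocoon's matcher.
final class WildcardUriMatcher {

    // Compiled patterns are cached so the translation runs once per wildcard, not per request.
    private static final Map<String, Pattern> CACHE = new ConcurrentHashMap<>();

    /** Translates a wildcard expression into an equivalent regular expression. */
    static Pattern compile(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < wildcard.length(); i++) {
            char c = wildcard.charAt(i);
            if (c == '*') {
                boolean doubled = i + 1 < wildcard.length() && wildcard.charAt(i + 1) == '*';
                regex.append(doubled ? "(.*)" : "([^/]*)"); // "**" may cross '/', "*" may not
                if (doubled) {
                    i++; // skip the second '*'
                }
            } else {
                regex.append(Pattern.quote(String.valueOf(c))); // everything else is literal
            }
        }
        return Pattern.compile(regex.toString());
    }

    /** Returns true if the whole URI matches the wildcard expression. */
    static boolean matches(String wildcard, String uri) {
        return CACHE.computeIfAbsent(wildcard, WildcardUriMatcher::compile)
                    .matcher(uri)
                    .matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("docs/*.html", "docs/index.html"));
        System.out.println(matches("docs/**", "docs/a/b.html"));
    }
}
```

Whether regular expressions are actually faster than a hand-written matcher would have to be measured with the profiler; the point of the sketch is only that the per-request cost should not include re-parsing the pattern.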
> Ok, that's it. Enough for a rainy Swiss afternoon.
> Anyway, Cocoon is pretty optimized from what we could see. So let's be
> happy about it.
