cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: benchmarking Cocoon?
Date Fri, 18 Apr 2003 20:36:40 GMT
on 4/19/03 9:48 AM Argyn wrote:


> Hi
> 
> Do you think there's a need for benchmarking Cocoon?

You mean benchmarking the internals or the results of standardized
pipelines?

> It's quite often when people are introduced to Cocoon, that they ask "isn't
> this supposed to be slow?" Those who work with Cocoon, probably, think that
> its overhead is relatively small, that the real problems are in good
> stylesheets, transformers etc., i.e. those things which have nothing to do
> with Cocoon itself. However, people consider Cocoon as a package. This whole
> thing which does the job. So, the performance concern is valid.

There are many things involved when a Cocoon request is executed. The
servlet engine, for example, can delay the request dramatically. Try
this very simple test: get a URI download metering tool (ApacheBench or
JMeter) and run it against a non-cached Cocoon resource. Then configure
Cocoon to output the time taken to generate the response as a comment.

If you compare the two, you'll see a *lot* of difference. This is due
to the HTTP transport + servlet engine and can add up to several
hundred milliseconds.
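
A minimal sketch of that comparison, in Python. It assumes the response
carries a comment of the form `<!-- generated in N ms -->`; that format,
the regex and the function names are hypothetical and only there for
illustration. The caller passes in a `fetch` function so that any HTTP
client can be used.

```python
import re
import time
from typing import Callable, Optional

# Hypothetical comment format; adjust to whatever your setup actually emits.
TIMING_COMMENT = re.compile(r"<!--\s*generated in (\d+(?:\.\d+)?)\s*ms\s*-->")


def server_time_ms(body: str) -> Optional[float]:
    """Return the generation time reported inside the response, if present."""
    match = TIMING_COMMENT.search(body)
    return float(match.group(1)) if match else None


def transport_overhead_ms(fetch: Callable[[], str]) -> Optional[float]:
    """Wall-clock time seen by the client minus the time reported by the server.

    The difference approximates what the HTTP transport and the servlet
    engine add on top of the pipeline itself.
    """
    start = time.perf_counter()
    body = fetch()
    wall_ms = (time.perf_counter() - start) * 1000.0
    inside_ms = server_time_ms(body)
    if inside_ms is None:
        return None  # no timing comment found, nothing to compare against
    return wall_ms - inside_ms
```

`fetch` could be, for instance, a lambda around
`urllib.request.urlopen(url).read().decode()`. A single sample is noisy,
so repeat the call many times and look at the median.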

> I've been thinking about a meaningful benchmark for Cocoon, a la TPC. So, I
> can run it, and get the number(s), and say "yeah, this is not bad on
> WebLogic..." e.g.

'meaningful' is the key word in such a context.

> If anyone was thinking about it, why not discuss it. If not, then I'll
> forget it for a while.

There are a few things that need to be seriously benchmarked:

 1) compiled generators vs. interpreted generators
 2) compiled transformers vs. interpreted transformers
 3) speed of serialization
 4) speed of multiple XSLT transformations compared to a single stylesheet

In short, it's very hard to tell where the hotspot of your pipeline is,
and profiler output doesn't help much, because event-driven (SAX)
processing interleaves the stages and they can't be cleanly separated.

One thing can be done, though: incremental profiling thru pipeline
dissection.

For example, consider a pipeline composed of

 g ---> t1 ---> t2 ---> t3 ----> s

The *real* timing is obtained if we cut the pipeline short and measure
the time taken with 5 different requests, each one stage longer than the
last... so

 request #1) time taken to execute g --->
 request #2) time taken to execute g ---> t1 --->
 and so on, until the full pipeline

This is the only meaningful way to show where the time is spent. We can
also add

 request #0) time taken to execute ""

that is, nothing: the time Cocoon takes not to execute any pipeline
stage, but just to set up the environment, crawl the sitemap/flowscript
and reach that point.
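
To turn those measurements into per-stage numbers, subtract each request
from the next one. A small sketch in Python, where `stage_costs` is a
hypothetical helper and the timings in the usage note are made up:

```python
from typing import List, Sequence, Tuple


def stage_costs(
    stages: Sequence[str], cumulative_ms: Sequence[float]
) -> List[Tuple[str, float]]:
    """Derive per-stage cost from the timings of successively longer pipelines.

    cumulative_ms[0] is request #0 (nothing executed, setup only);
    cumulative_ms[i] is the request that runs stages[0] .. stages[i-1].
    """
    if len(cumulative_ms) != len(stages) + 1:
        raise ValueError("need one timing per stage plus the empty request #0")
    costs = [("setup", cumulative_ms[0])]
    for i, name in enumerate(stages, start=1):
        # What the i-th stage added on top of the shorter pipeline before it.
        costs.append((name, cumulative_ms[i] - cumulative_ms[i - 1]))
    return costs
```

With made-up timings, `stage_costs(["g", "t1", "t2", "t3", "s"],
[20, 50, 90, 100, 160, 175])` attributes 20 ms to setup, 30 to g, 40 to
t1, 10 to t2, 60 to t3 and 15 to s. The differences are only as good as
the measurements, so feed in averages or medians over many runs.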

Even more meaningful would be to do something like

 request 0 -> nothing
 request 1 -> g -->
 ...
 request n+2 -> g --> t1 ... tn ---> s
 request n+3 -> --> t1 ... tn ---> s
 ...
 request 2n+3 -> ---> s

This could well be automated to run at night and present you in the
morning with a detailed plan of action on *where* your pipelines are
*really* spending time.
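
The list of requests for such a nightly run is easy to generate. A rough
sketch in Python, where `dissection_plan` is a hypothetical name and how
each partial pipeline is actually exposed as a URI is left open:

```python
from typing import List, Sequence


def dissection_plan(stages: Sequence[str]) -> List[List[str]]:
    """List the partial pipelines to time, in request order.

    First the prefixes (nothing, g, g t1, ... up to the full pipeline),
    then the suffixes (everything but g, ... down to the serializer alone).
    """
    stages = list(stages)
    prefixes = [stages[:i] for i in range(len(stages) + 1)]
    suffixes = [stages[i:] for i in range(1, len(stages))]
    return prefixes + suffixes


if __name__ == "__main__":
    for number, partial in enumerate(dissection_plan(["g", "t1", "t2", "s"])):
        print(f"request {number} -> {' --> '.join(partial) or 'nothing'}")
```

Each entry would then be mapped to a test URI and timed. Whether a
pipeline without its generator or serializer can be run at all depends
on the setup, so treat the suffix half as the more speculative part.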

The problem with the above is that such a tool needs to be stateful and
have some JMeter-like features, such as recording logs and playing back
sessions, or it wouldn't be useful in a serious web-app environment.

Anyway, food for thought.

-- 
Stefano.


