cocoon-dev mailing list archives

From Berin Loritsch
Subject Re: RT - Tree Traversal Implementation of Sitemaps and XSP
Date Fri, 05 Oct 2001 13:52:45 GMT
Stefano Mazzocchi wrote:

> > In combination with the Strategy pattern from the GOF, I
> > got the following crazy idea:
> >
> > Why not parse the sitemap and create an in-memory representation
> > (DOM,JDOM,???) where each node references both a component (Reader,
> > Transformer, etc.) and a traversal strategy.  When a request comes in:
> >
> > 1. Create the appropriate object encapsulating the parameters, etc.
> > 2. Request a thread from a request handling thread pool
> > 3. Pass the thread the request object and the root of the sitemap tree
> >
> > Traversal of the tree basically consists of:
> >
> > 1. Request the strategy for the current node
> > 2. Ask the strategy to do its thing
> >   2a. Which in most cases is likely to be a pre-order traversal, I think
> >
> > For some reason the conceptual model of a bunch of request objects
> > traversing the sitemap tree seems a lot clearer than the current approach.
> Oh, most definitely.
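
A minimal sketch of the traversal idea quoted above, in Java. Every name in
it (TraversalStrategy, SitemapNode, RequestContext, PreOrderStrategy) is
hypothetical and not an existing Cocoon class, and the nodes only carry a
name where a real node would also reference its component. It is meant only
to show the shape of "ask the node for its strategy, let the strategy do its
thing", not a worked-out design.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// All names here are hypothetical; none are existing Cocoon classes.

/** Decides how a request moves through one node and its children. */
interface TraversalStrategy {
    void traverse(SitemapNode node, RequestContext request);
}

/** Carries the request parameters and records the path taken through the tree. */
final class RequestContext {
    final String uri;
    final List<String> visited = new ArrayList<>();

    RequestContext(String uri) { this.uri = uri; }
}

/** One sitemap element; a real node would also reference a Reader, Transformer, etc. */
final class SitemapNode {
    final String name;
    final TraversalStrategy strategy;
    final List<SitemapNode> children = new ArrayList<>();

    SitemapNode(String name, TraversalStrategy strategy) {
        this.name = name;
        this.strategy = strategy;
    }

    SitemapNode add(SitemapNode child) { children.add(child); return this; }

    /** Traversal steps 1 and 2: take this node's strategy and let it do its thing. */
    void accept(RequestContext request) { strategy.traverse(this, request); }
}

/** Pre-order: handle the node itself, then each child from left to right. */
final class PreOrderStrategy implements TraversalStrategy {
    @Override
    public void traverse(SitemapNode node, RequestContext request) {
        request.visited.add(node.name);
        for (SitemapNode child : node.children) {
            child.accept(request);
        }
    }
}

class SitemapTraversalDemo {
    public static void main(String[] args) throws Exception {
        TraversalStrategy preOrder = new PreOrderStrategy();
        SitemapNode root = new SitemapNode("sitemap", preOrder)
            .add(new SitemapNode("match", preOrder)
                .add(new SitemapNode("generate", preOrder)))
            .add(new SitemapNode("serialize", preOrder));

        RequestContext request = new RequestContext("/index.html"); // 1. request object
        ExecutorService pool = Executors.newFixedThreadPool(4);     // 2. thread pool
        try {
            pool.submit(() -> root.accept(request)).get();          // 3. hand over request and root
        } finally {
            pool.shutdown();
        }
        System.out.println(request.visited); // the path the request took
    }
}
```

The visited list is what would make the "ask the request to report on the
path through the tree" style of debugging possible.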

It also allows for an easier way to implement hierarchy lists.  You know how
on some sites (like directory sites) you have a string of links across the
top looking like this:

Home > Projects > Cocoon > Installation

I have been trying to come up with a good way for Cocoon to automatically create
this, and haven't come up with anything elegant.  Tree traversal would be a
perfect match.
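
To illustrate why a tree is a good match, here is a rough sketch in Java.
PageNode and Breadcrumbs are made-up names, and it assumes each node keeps a
reference to its parent, which is not something the sitemap does today. The
trail is just the walk from the matched node back up to the root.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical types for illustration; not part of Cocoon.

/** A node in the site tree that knows its display title and its parent. */
final class PageNode {
    final String title;
    final PageNode parent; // null for the root

    PageNode(String title, PageNode parent) {
        this.title = title;
        this.parent = parent;
    }
}

final class Breadcrumbs {
    /** Walks from the given node up to the root and joins the titles root-first. */
    static String trail(PageNode node) {
        Deque<String> titles = new ArrayDeque<>();
        for (PageNode n = node; n != null; n = n.parent) {
            titles.addFirst(n.title);
        }
        return String.join(" > ", titles);
    }
}

class BreadcrumbDemo {
    public static void main(String[] args) {
        PageNode home = new PageNode("Home", null);
        PageNode projects = new PageNode("Projects", home);
        PageNode cocoon = new PageNode("Cocoon", projects);
        PageNode installation = new PageNode("Installation", cocoon);
        System.out.println(Breadcrumbs.trail(installation));
    }
}
```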

> > Is this a good idea?  I really don't know.
> I'll tell you one thing: don't know if you had the patience to read thru
> all the recent RT between Berin and I, but we both indicated that the
> current technique to compile the sitemap is slow, hard to understand and
> very impractical at least for Cocoon development.
> The only place where sitemap compilation might have a place is for
> binary-only cocoon webapps that need to be deployed in production, but
> this is yet to be seen.

I think there are a couple live C2 apps.  If the sitemap doesn't change, or
get touched, the compiled class is used.  It is not an explicit contract,
but it does happen.

> > It does everything the current
> > approach does (I think), feels cleaner (to me) , and might make debugging
> > easier (ask the request to report on the path through the tree that it
> > took).  Without trying I can't tell if it would be faster or slower, or what
> > the RAM consumption would be.
> For sure, the memory consumption would be worse for production
> environments where sitemaps never change and better for development
> environments where sitemaps continuously change.

Servers usually have copious amounts of RAM... ;)
Seriously, several projects (like Squid proxy server) use a compiled BTREE
implementation during run time.  It assembles the BTREE in RAM, and serializes
it to the disk so it can read it directly back in later.  BTREEs are supposed
to be one of the fastest run-time approaches for tree traversals but do it
at the price of extensive use of RAM.

> In theory, (if we are lucky) we might find out that such a code might
> even be faster than if written directly in native code: in fact, such a
> continuous and dynamically adapting JIT polishing of the native code,
> might turn out to perform real-time optimization that adapts on the type
> of load, something that would be much harder to do if we were to rewrite
> Cocoon in native code to make it faster.

It very well might be quicker.  I have an idea that is not tree traversal,
but it would work equally well.  The short description is this:

All pipelines are pre-configured and treated as proper components.  As a
URI request comes in, it is fed to the Sitemap Pipeline Selector.  This
approach takes advantage of pooling, and as specific pipelines are used,
the pool keeps enough in memory.  Thus, the pipelines that are used less
often don't have as many instances.

In the end, the structure of a pipeline is configurable--encapsulating
sub-sitemaps and all the other logic.  With the pipeline centric approach,
the lookup is done via a HashMap scheme which has proven to be efficient
during runtime.  The maximum number of lookups is directly related to
the number of sub sitemaps.
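
A rough sketch of the lookup side of this, in Java. UriMap and its methods
are invented for illustration, a pipeline is reduced to a name, and pooling
is left out entirely. It assumes sub sitemaps are mounted on the first path
segment, which is only one possible way to do it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch; UriMap is not an existing Cocoon class.

/** Maps paths to pre-configured pipelines, delegating to mounted sub-maps. */
final class UriMap {
    private final Map<String, String> pipelines = new HashMap<>(); // path -> pipeline name
    private final Map<String, UriMap> mounts = new HashMap<>();    // first segment -> sub-map

    UriMap pipeline(String path, String name) { pipelines.put(path, name); return this; }

    UriMap mount(String segment, UriMap sub) { mounts.put(segment, sub); return this; }

    /** One hash lookup here, plus at most one more per level of mounted sub-map. */
    Optional<String> select(String path) {
        String direct = pipelines.get(path);
        if (direct != null) {
            return Optional.of(direct);
        }
        int slash = path.indexOf('/');
        if (slash < 0) {
            return Optional.empty(); // nothing mapped: the caller can answer 404 right away
        }
        UriMap sub = mounts.get(path.substring(0, slash));
        if (sub == null) {
            return Optional.empty();
        }
        return sub.select(path.substring(slash + 1));
    }
}

class UriMapDemo {
    public static void main(String[] args) {
        UriMap docs = new UriMap().pipeline("install.html", "docs-html");
        UriMap root = new UriMap()
            .pipeline("index.html", "home")
            .mount("docs", docs);

        System.out.println(root.select("index.html"));
        System.out.println(root.select("docs/install.html"));
        System.out.println(root.select("docs/missing.html"));
    }
}
```

In this sketch a miss falls out of the same lookups as a hit, so an unmapped
URI is known without walking any matchers.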

This proposal combined with Cocoon Apps (with the URIMap, FlowMap, and
named Cocoon components) would be very effective.  The shift in focus
of the URIMap vs. the Sitemap automatically gets you thinking what will
be the least amount of work to achieve the desired URI space.  The URI
space will be immediately known--and 404 errors can be even more quickly

As it is currently implemented, the Sitemap gets the implementor thinking
linearly instead of spatially.  In other words the view of the Sitemap
administrator is a straight line of if/then/else statements as opposed
to a two dimensional map of URI space and resources.

> > It really seems to make debugging a pain, as well as placing a
> > dependency on having a java compiler on the runtime system.
> Absolutely, especially given javac which is written only for command
> line and has no stream in/out capabilities and is not redistributable.

BCEL (recently proposed to join the apache fold) would take care of that.
For those of you who don't know what BCEL is, it is a byte code engineering
library with which you can implement your own parsers and have valid Java
byte code come out.

It does not erase the issue of debugging, however.  One thing that can
more easily be done by implementing your own parser is embedding locator
logic so that your exceptions report line numbers in the Sitemap and XSP

> > I haven't been
> > trying too hard to figure out what the advantages to this approach are, but
> > I haven't come up with any.
> As I said (but I have no figures to prove this), I believe current
> approach is a winner for JDK 1.1 and 1.2, while it is very likely to fail
> on all issues but memory consumption on JDK 1.3 and 1.4 (and very
> likely, all future versions)

We have already had some evidence of this.  There are some serious resource/memory
leaks involved in dynamically compiled resources that Cocoon developers have
tried to work around.  I don't see it getting any better real soon.
