cocoon-dev mailing list archives

From Sylvain Wallez
Subject Re: [FYI] How TreeProcessor Works
Date Fri, 24 Oct 2003 14:55:05 GMT
Berin Loritsch wrote:

> TreeProcessor is a complicated beast, so examining the classes does 
> not lend any clues to what is going on.  However, the key to 
> understanding TreeProcessor is the treeprocessor-builtins.xml file. 

?? Haven't you seen my explanation to your previous request?


> We have an XML document with the following DTD:


> So with a mock XML slimmed down to just the simplest state:
> <tree-processor>
>   <language name="sitemap"
>       class="org.apache.cocoon.components.treeprocessor.sitemap.SitemapLanguage"
>       pool-min="1" pool-max="1">
>     <namespace uri=""/>
>     <file name="sitemap.xmap"/>
>     <parameter element="parameter"/>
>     <!-- roles skipped because they are irrelevant -->
>     <nodes>
>       <node name="pipelines"
>           builder="org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNodeBuilder">
>         <allowed-children>pipeline, handle-errors</allowed-children>
>         <ignored-children>component-configurations</ignored-children>
>         <forbidden-children>sitemap, components, pipelines</forbidden-children>
>       </node>
>     </nodes>
>   </language>
> </tree-processor>
> What is happening here is that we define a sitemap tree parser by 
> first identifying how to recognize the sitemap: the namespace for the 
> XML, the default file name, how to recognize the "parameter" element 
> (special to TreeProcessor semantics).  I skipped the roles definition 
> because in Cocoon 2.2 it won't be needed.  However, it describes the 
> default types of components that the tree processor expects.


> The Nodes section is the heart of the system.  It maps XML elements to 
> Builder objects which perform some sort of logic.  The child elements 
> "allowed-children", "ignored-children", and "forbidden-children" act 
> as a "poor man's" DTD so to speak.  At least they provide some 
> explicit processing hints that augment a DTD.  In the example above, 
> the "pipeline" and "handle-errors" are child nodes that are explicitly 
> allowed and handled from inside the "pipelines" node.  The 
> "component-configurations" node is allowed to exist as a child of the 
> "pipelines" node, but no processing occurs.  Lastly, the 
> "forbidden-children" element identifies nodes that cannot exist as a 
> child of the "pipelines" node. 
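To make the three child lists concrete, here is a minimal sketch of how a builder could classify a child element name against them. ChildRules and its Verdict enum are hypothetical names invented for this illustration, not Cocoon classes, and the precedence between the lists as well as the handling of undeclared names are assumptions.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper (not a Cocoon class): classifies a child element name. */
final class ChildRules {
    enum Verdict { ALLOWED, IGNORED, FORBIDDEN, UNDECLARED }

    private final Set<String> allowed;
    private final Set<String> ignored;
    private final Set<String> forbidden;

    ChildRules(String allowed, String ignored, String forbidden) {
        this.allowed = parse(allowed);
        this.ignored = parse(ignored);
        this.forbidden = parse(forbidden);
    }

    /** Splits a list on commas surrounded by any amount of whitespace. */
    static Set<String> parse(String list) {
        Set<String> names = new HashSet<>();
        if (list == null) {
            return names;
        }
        for (String name : list.trim().split("\\s*,\\s*")) {
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /** Assumed precedence: forbidden, then ignored, then allowed. */
    Verdict classify(String childName) {
        if (forbidden.contains(childName)) {
            return Verdict.FORBIDDEN;
        }
        if (ignored.contains(childName)) {
            return Verdict.IGNORED;
        }
        if (allowed.contains(childName)) {
            return Verdict.ALLOWED;
        }
        return Verdict.UNDECLARED;
    }
}
```

With the lists from the "pipelines" node above, "pipeline" would be handed to its builder, "component-configurations" would be skipped silently, and a nested "sitemap" would be rejected.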


> All the listed elements (separated by a comma and any amount of 
> whitespace) must be declared nodes so that they can be processed.
>
> In theory, XSP pages *could* be implemented with the TreeBuilder, but 
> in practice, you cannot predict the schemas used for elements other  
> than the XSP specific ones.  The TreeProcessor is best suited for 
> fully encapsulated schemas that act as a sort of language like the 
> Sitemap.

XSP also has the particularity of allowing embedded java code, meaning 
it requires the production of java code and thus cannot be implemented 
with a tree-evaluation based approach.

> This at least is the base theory behind the TreeProcessor--as far as I 
> can tell.  Please let me know if I am missing it somewhere. 

You're totally right!

> As to implementation, the TreeBuilder creates a hierarchy of ECM 
> implementations that add any necessary components and Builder 
> components. The particularly troublesome portion of this is the use of 
> the Recomposable interface.
>
> The whole issue with the Recomposable interface as it is written here 
> is that the child and parent component managers are constantly 
> overwriting each other. This is a serious conflict, and it will break 
> as soon as we proxy components. The proxied components hide any 
> lifecycle interfaces so that no rogue client can usurp the component 
> manager, or any other part of the lifecycle of a component, and 
> provide for a more stable system.

The Recomposable interface is used here so that node builders know the 
component manager of the tree that is being built, because this is where 
the builders should look up components when they need some.

I admit this is not clean, as it mixes the container which manages the 
node builders (built with the treeprocessor-builtins.xml file) and the 
container in which the tree that is being built has to live.

A solution can be to add a getTreeManager() method to the TreeBuilder 
interface, that would return the manager for the tree being built (i.e. 
the one defined by <map:components>).
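A rough sketch of what this could look like. The ComponentManager interface below is a simplified stand-in for the Avalon one (its lookup declares no checked exception here) so that the example compiles on its own; only the proposed getTreeManager() accessor is shown on TreeBuilder, and ExampleNodeBuilder is a hypothetical class for illustration.

```java
/** Simplified stand-in for Avalon's ComponentManager (assumption, not the real interface). */
interface ComponentManager {
    Object lookup(String role);
}

/** Only the proposed accessor is shown; the real TreeBuilder has other methods. */
interface TreeBuilder {
    /** Returns the manager of the tree being built, not the one managing the builders. */
    ComponentManager getTreeManager();
}

/** Hypothetical node builder that no longer needs recomposing. */
final class ExampleNodeBuilder {
    private final TreeBuilder treeBuilder;

    ExampleNodeBuilder(TreeBuilder treeBuilder) {
        this.treeBuilder = treeBuilder;
    }

    /** Looks up a component in the tree's own manager. */
    Object lookupInTree(String role) {
        return treeBuilder.getTreeManager().lookup(role);
    }
}
```

The point of the design is that the builder keeps the manager of its own container untouched and asks the TreeBuilder for the tree's manager only when it needs one.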

How does it sound?

> The Recomposable calls scare me because they look like something that 
> would work under low load, but would break down under high load.  With 
> something like Cocoon that is a big issue.  I don't have any numbers 
> to show everyone, but it is just a feeling I get by looking at the code.

You should not worry, since this is used only to _build_ the sitemap, 
i.e. at startup or when the sitemap file is changed.

> As to the nitty gritty details of how the node tree is built and run, 
> I am still somewhat fuzzy on the details.  I know we have a bunch of 
> NodeBuilders, which instantiate the Nodes, which in turn are special 
> components.  The NodeBuilders can be viewed as a sort of intelligent 
> object creator, but I am not sure whether the Nodes are components 
> with relaxed requirements on the constructor, or if the Nodes are 
> simple objects.  Those Nodes are what does the hard work.  Once the 
> tree is built, the builders are not necessary any more (unless you 
> want to keep building new trees).

Please refer to my previous post mentioned above. Your analysis is 
right, and Nodes are in between components and simple objects: the 
DefaultTreeBuilder.setupNode() method will honor any lifecycle interface 
implemented by a node, and if the node implements Disposable, it is 
added to a list that is used to dispose a processing tree when needed 
(system shutdown or sitemap reload).
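The idea can be sketched as follows. This is not the actual DefaultTreeBuilder code: Initializable and Disposable are simplified stand-ins for the Avalon lifecycle interfaces, only those two are handled, and TreeSetupSketch with its method names is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-ins for two Avalon lifecycle interfaces (assumption). */
interface Initializable {
    void initialize();
}

interface Disposable {
    void dispose();
}

/** Hypothetical sketch of the setup-and-track idea. */
final class TreeSetupSketch {
    // Nodes to dispose when the processing tree is thrown away.
    private final List<Disposable> disposableNodes = new ArrayList<>();

    /** Honors the lifecycle interfaces the node implements, then returns it. */
    <T> T setupNode(T node) {
        if (node instanceof Initializable) {
            ((Initializable) node).initialize();
        }
        if (node instanceof Disposable) {
            disposableNodes.add((Disposable) node);
        }
        return node;
    }

    /** Called on system shutdown or sitemap reload. */
    void disposeTree() {
        for (Disposable node : disposableNodes) {
            node.dispose();
        }
        disposableNodes.clear();
    }

    int trackedCount() {
        return disposableNodes.size();
    }
}
```

A node that implements neither interface simply passes through setupNode() unchanged, which is what lets plain objects and component-like nodes share one code path.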

I came to this NodeBuilder/Node pattern since Nodes need to behave like 
components but cannot be declared as such, as the configuration of a 
given node type (e.g. GenerateNode) highly depends on its environment 
(i.e. the corresponding markup in the sitemap file). Moreover, a single 
NodeBuilder implementation can produce nodes of different classes, also 
depending on the environment (see ActNodeBuilder or CallNodeBuilder).
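To illustrate the last point, here is a sketch of one builder returning nodes of different classes depending on the markup it is given. All names here (Node, SingleNode, GroupNode, SketchNodeBuilder, the "type" and "group" attributes) are invented for the example and are not taken from ActNodeBuilder or CallNodeBuilder.

```java
import java.util.Map;

/** Hypothetical node contract for this sketch. */
interface Node {
    String describe();
}

final class SingleNode implements Node {
    private final String type;

    SingleNode(String type) {
        this.type = type;
    }

    public String describe() {
        return "single:" + type;
    }
}

final class GroupNode implements Node {
    private final String group;

    GroupNode(String group) {
        this.group = group;
    }

    public String describe() {
        return "group:" + group;
    }
}

/** One builder, two possible node classes, chosen from the element's attributes. */
final class SketchNodeBuilder {
    Node buildNode(Map<String, String> attributes) {
        String group = attributes.get("group");
        if (group != null) {
            return new GroupNode(group);
        }
        String type = attributes.get("type");
        if (type == null) {
            throw new IllegalArgumentException("expected a 'type' or 'group' attribute");
        }
        return new SingleNode(type);
    }
}
```

Because the choice is made once at build time, the resulting tree contains only nodes already specialized for their markup.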

> I know I want to have a new Container per sitemap, but I think I need 
> some help in mapping it to this problem space.  Ovidiu, do you think 
> you could at least spare some guidance? 

Ahem... I guess Ovidiu isn't the right person for this stuff, but 
I hope my explanation will help ;-)


Sylvain Wallez                                  Anyware Technologies 
{ XML, Java, Cocoon, OpenSource }*{ Training, Consulting, Projects }
Orixo, the opensource XML business alliance  -
