cocoon-dev mailing list archives

From "Unico Hommes" <Un...@hippo.nl>
Subject RE: [RT] ComponentizedProcessor (was RE: Migrating TreeProcessor to Fortress)
Date Wed, 12 Nov 2003 10:04:43 GMT


> 
> -----Original Message-----
> From: Sylvain Wallez [mailto:sylvain@apache.org] 
> Sent: Tuesday, 11 November 2003 21:48
> To: dev@cocoon.apache.org
> 
> Hi all,
> 
> Here's a RT about Unico's proposal of "flattening" the 
> sitemap for the migration to Fortress. Please read carefully, 
> this has a lot of implications.
> 
> 
> Introduction
> ------------
> Today is a public holiday in France. We "celebrate" (should we be
> enjoying that?) the end of World War I, and this is the occasion to
> explain to children what their great-grandfathers went through
> a century ago, hoping this won't happen again. I was doing
> some DIY at home, and manual work freezes my brain. So while
> digging in the garden, I was thinking of Unico's "iconoclast"
> proposal about the sitemap engine. Yes, the treeprocessor is
> still somehow "my baby", and seeing it shaken up as it is these
> days makes me think a lot about it.
> 
> And then came the sudden revelation: Unico's idea is 
> brilliant and its implications go far beyond the migration to 
> Fortress.
> 
> 
> Implications
> ------------
> Considering every sitemap statement as a component makes it very easy
> to implement a number of features that either have been wanted for a
> long time but were never implemented because of their complexity, or
> that will be needed for blocks:
> 
> 1/ Virtual components
> Virtual components are sitemap snippets that can be used in place of
> "regular" components. In many languages, these are called "macros".
> With sitemap statements as components, virtual components are a breeze
> to implement: just look up the component, and see if what's returned is
> a regular sitemap component (e.g. a Serializer) or if it's a
> ProcessingNode. If it's a regular sitemap component, add it to the
> pipeline, and otherwise invoke the ProcessingNode.
>
> What I'm not sure about here is whether it's possible (or even
> desirable) to have two different implementation interfaces for a
> single role.
> 

The problem with Fortress here is that it forces the role to be the
implementation interface. It is also due to the way Fortress handles
meta data.

Can't virtual components just implement their respective pipeline
component interfaces: Transformer, Generator, Serializer? This way we'd
treat them just as regular pipeline components.
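
To make the lookup Sylvain describes concrete, here is a minimal sketch in
plain Java. All the types in it (SitemapComponent, ProcessingNode, Pipeline,
and the Map standing in for the component manager) are hypothetical
stand-ins invented for illustration, not the Cocoon, ECM or Fortress APIs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins, not the real Cocoon interfaces.
interface SitemapComponent { String name(); }

interface ProcessingNode { void invoke(Pipeline pipeline); }

class Pipeline {
    final List<String> steps = new ArrayList<>();
    void add(SitemapComponent c) { steps.add(c.name()); }
}

class VirtualComponentDemo {
    // Look up a hint: a regular component is added to the pipeline,
    // a ProcessingNode (virtual component) is invoked so it can expand itself.
    static void use(Map<String, Object> manager, String hint, Pipeline pipeline) {
        Object found = manager.get(hint);
        if (found instanceof ProcessingNode) {
            ((ProcessingNode) found).invoke(pipeline);
        } else if (found instanceof SitemapComponent) {
            pipeline.add((SitemapComponent) found);
        } else {
            throw new IllegalArgumentException("No component for hint: " + hint);
        }
    }

    static Pipeline demo() {
        Map<String, Object> manager = new HashMap<>();
        SitemapComponent xslt = () -> "xslt";
        SitemapComponent html = () -> "html";
        manager.put("html", html);
        // A virtual serializer: a sitemap snippet that expands into two steps.
        ProcessingNode styledHtml = p -> { p.add(xslt); p.add(html); };
        manager.put("styled-html", styledHtml);

        Pipeline pipeline = new Pipeline();
        use(manager, "styled-html", pipeline); // virtual: invoked
        use(manager, "html", pipeline);        // regular: added directly
        return pipeline;
    }
}
```

The alternative suggested just above, where a virtual component simply
implements Serializer (or Generator, Transformer), would remove the
instanceof branch from the caller entirely.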

> 2/ Resource inheritance
> Resources are nothing more than untyped virtual components (yeah
> Stefano, I know, they should be serializers). So if a resource isn't
> defined in a sitemap, we go up to the parent sitemap's component
> manager and look up the resource there.
> 
> 3/ Block-defined sitemap components
> A block can provide sitemap (and other) components to other blocks,
> including virtual components. Nothing special here actually, except
> that block inheritance is implemented, once again, by the parent
> relationship of component managers.
>
> 4/ View inheritance
> Views are nothing more than virtual serializers, with the main
> difference that their hint is defined at runtime by the "cocoon-view"
> parameter. And since these are components, lookup goes up to the
> parent sitemap if a view is not declared in a given sitemap, thus
> providing inheritance.
> 

Cool.
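
For the record, the parent-fallback lookup that gives resource and view
inheritance can be sketched as follows. SitemapManager and its declare and
lookup methods are hypothetical names made up for this example; the real
component managers have a different API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical hierarchical manager; not the ECM or Fortress API.
class SitemapManager {
    private final SitemapManager parent; // null for the root sitemap
    private final Map<String, Object> components = new HashMap<>();

    SitemapManager(SitemapManager parent) { this.parent = parent; }

    void declare(String role, String hint, Object component) {
        components.put(role + "/" + hint, component);
    }

    // Local declarations win; otherwise the lookup climbs to the parent.
    Object lookup(String role, String hint) {
        Object found = components.get(role + "/" + hint);
        if (found != null) return found;
        if (parent != null) return parent.lookup(role, hint);
        return null;
    }
}

class InheritanceDemo {
    static String[] demo() {
        SitemapManager root = new SitemapManager(null);
        root.declare("view", "content", "root-content-view");
        root.declare("view", "links", "root-links-view");

        SitemapManager sub = new SitemapManager(root);
        sub.declare("view", "content", "sub-content-view"); // overrides the parent

        // In the proposal the hint would come from the "cocoon-view" parameter.
        return new String[] {
            (String) sub.lookup("view", "content"), // found locally
            (String) sub.lookup("view", "links"),   // inherited from the root
            (String) sub.lookup("view", "missing")  // declared nowhere
        };
    }
}
```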

> 
> Side note: relative URIs
> ------------------------
> The various considerations about inheritance above lead to the
> question of how relative source URIs are resolved (Carsten raised this
> issue some time ago): what is the base URI that should be used by the
> resolver?
> 
> My opinion is that the base URI should be the one of the sitemap 
> _handling_ the request. This means that "jumping" to another sitemap 
> through virtual components or view inheritance should not affect the 
> base URI.
> 
> However, there are many situations where we want to use a source
> relative to the _current_ sitemap regardless of how it's called. For
> this, I propose a new protocol similar to how "context:" behaves with 
> the root sitemap, but for non-root sitemaps. The "sitemap:" protocol 
> comes to mind, but I'm not sure this is a good name.
> 

Wild idea: context:/ identifies the current context, context://
identifies the root sitemap? Like in the cocoon: protocol?

> 
> Performance considerations
> --------------------------
> When writing the TreeProcessor, great care was taken to pre-analyse
> everything possible in order to achieve maximum runtime speed. So far
> I have found only two points where performance degrades with this new
> approach:
>
> - it's not possible to choose the ProcessingNode implementation
> depending on the class of a component as is done, e.g., in
> MatchNodeBuilder. In the end, the cost is just an "instanceof" check
> to choose the right behaviour.
> 

Are you talking about the Matcher/PreparableMatcher pair? Not sure why
polymorphism should break here?
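
To show what I mean, here is a rough sketch of a single node that handles
both kinds of matcher with one instanceof check. The interfaces below are
simplified, hypothetical shapes (the real Cocoon matcher signatures take
more arguments), and PrefixMatcher is invented purely for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical shapes; the real Cocoon interfaces differ.
interface Matcher {
    Map<String, String> match(String pattern, String uri);
}

interface PreparableMatcher extends Matcher {
    Object preparePattern(String pattern);
    Map<String, String> preparedMatch(Object prepared, String uri);
}

class MatchNode {
    private final Matcher matcher;
    private final String pattern;
    private final Object prepared; // null when the matcher is not preparable

    MatchNode(Matcher matcher, String pattern) {
        this.matcher = matcher;
        this.pattern = pattern;
        // The single "instanceof" check, done once when the node is built.
        this.prepared = (matcher instanceof PreparableMatcher)
                ? ((PreparableMatcher) matcher).preparePattern(pattern)
                : null;
    }

    Map<String, String> invoke(String uri) {
        if (prepared != null) {
            return ((PreparableMatcher) matcher).preparedMatch(prepared, uri);
        }
        return matcher.match(pattern, uri);
    }
}

// Toy matcher for "prefix*" patterns, counting how often it prepares.
class PrefixMatcher implements PreparableMatcher {
    int prepareCalls = 0;

    public Object preparePattern(String pattern) {
        prepareCalls++;
        return pattern.endsWith("*")
                ? pattern.substring(0, pattern.length() - 1)
                : pattern;
    }

    public Map<String, String> preparedMatch(Object prepared, String uri) {
        String prefix = (String) prepared;
        if (!uri.startsWith(prefix)) return null;
        Map<String, String> result = new HashMap<>();
        result.put("1", uri.substring(prefix.length()));
        return result;
    }

    public Map<String, String> match(String pattern, String uri) {
        return preparedMatch(preparePattern(pattern), uri);
    }
}
```

With this shape the pattern is still prepared only once per node, so the
pre-analysis is kept even though the builder class is gone.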

> - mapping from view names to their labels is pre-computed in the
> TreeProcessor for each individual sitemap component, so that the
> view's ProcessingNode (if any) can be found directly with the view
> name (see SitemapLanguage.getViewsForStatement and e.g.
> GenerateNode.invoke()). But, considering that views are rarely used in
> a production environment, the few extra lookups can be considered
> negligible.
> 
> 
> Implementation
> --------------
> The implementation mainly consists of merging the code of the
> ProcessingNodeBuilder classes into the corresponding ProcessingNode
> classes.
>
> The initial "flattening" transformation can be implemented in XSL,
> whose simplicity will make it possible to implement at this level some
> semantic checks that can be difficult to implement otherwise.
> 

Agreed.

> However, an important requirement is to keep location information of
> sitemap statements. For this I suggest augmenting the sitemap SAX
> stream by adding Locator information in a "location" attribute on
> every element. This augmentation can be useful in several other
> contexts such as Woody (it would avoid the dependency on Xerces in
> DomUil.LocationTrackingDOMParser). This way, the initial location
> information can survive any kind of transformation.
> 

Yes, we need to do this.
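
A possible shape for this, using only the standard SAX API: a filter that
copies the Locator position into a "location" attribute on every element.
The class name LocationFilter and the "systemId:line:column" value format
are assumptions made for this sketch, not something already agreed on.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: adds location="systemId:line:column" to each element.
class LocationFilter extends XMLFilterImpl {
    private Locator locator;

    LocationFilter(XMLReader parent) { super(parent); }

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator; // keep it so startElement can query positions
        super.setDocumentLocator(locator);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        AttributesImpl augmented = new AttributesImpl(atts);
        if (locator != null) {
            String value = locator.getSystemId() + ":"
                    + locator.getLineNumber() + ":" + locator.getColumnNumber();
            augmented.addAttribute("", "location", "location", "CDATA", value);
        }
        super.startElement(uri, localName, qName, augmented);
    }
}

class LocationDemo {
    // Parses a tiny sitemap-like document and records each element's location.
    static List<String> demo() {
        String xml = "<sitemap>\n  <match pattern=\"*\">\n"
                   + "    <generate/>\n  </match>\n</sitemap>";
        final List<String> seen = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            LocationFilter filter = new LocationFilter(reader);
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    seen.add(localName + "@" + atts.getValue("location"));
                }
            });
            InputSource source = new InputSource(new StringReader(xml));
            source.setSystemId("file:///sitemap.xmap");
            filter.parse(source);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

Because the position travels as an ordinary attribute, any later stage
(including an XSL "flattening" step that copies attributes through) can
still report where a statement came from.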

> From a security and abuse point of view, I'm wondering if all sitemap
> statement components should be made visible to other components
> through the container. If we don't want this, the sitemap engine could
> consist of two component managers, one containing the "public"
> statements such as views, resources, virtual components and the
> contents of <map:component>, and a child "private" manager containing
> other sitemap statements. This may also allow the public container to
> be less loaded and therefore faster.
> 

OK.

> 
> Conclusion
> ----------
> This new approach seems to have very few drawbacks (hope I did not
> miss something important), and will lead to a dramatic simplification
> of the sitemap engine, the most noticeable effect being that the
> number of classes will be halved.
> 

Cool, I am glad you say this. I was starting to think I was just shooting
my mouth off. (Which of course I was, but it somehow turned out all right ;-)

> There's only one implication on Cocoon's core: the ProcessingNode 
> interface is now a public contract between processors, since this is 
> what all these components implement.
> 
> The only criticism (yes, there needs to be some ;-) is that I took
> great care in the TreeProcessor to separate build-time code and
> run-time code, while the ComponentizedProcessor will merge them in a
> single class. That separation allows all build-time data structures to
> be garbage collected, since we will never need them again. I also had
> the secret hope of being able to serialize the processing tree, in
> order to use a pre-built tree on small devices (remember, I run Cocoon
> in small places), but this proved to be difficult if not impossible
> because components have a lot of relations with non-serializable
> objects.
> 
> I'm wondering if we should write this new sitemap engine in the 2.2 
> branch or if it should go in the 2.1. Fortress isn't a requirement to 
> implement this, and it will allow us to provide views and resource 
> inheritance before the 2.2 is out.
> 

I agree with Carsten that we should develop it in 2.2 and see later if
we can port it to ECM so it's useable from 2.1 as well.

> And I also think we should consider this approach when migrating Woody
> to CocoonForms, since Woody uses the same mechanism as the
> TreeProcessor to build widget definition trees.
>
> Thanks again Unico for this brilliant idea.
> 

Actually, Sylvain, I wasn't trying to solve all the things you said this
idea now solves. Nope, don't blame me ;-)

Unico


