cocoon-dev mailing list archives

From Stefano Mazzocchi <>
Subject Re: [RT] Alternatives to sitemap
Date Sun, 08 Jul 2001 10:30:12 GMT
> I have been thinking about the whole sitemap approach in Cocoon 2.
> It is the point that has the biggest learning curve, and it is a
> single point of failure.  The following email on the Turbine mail
> list prompted me to voice this opinion now instead of later.  After
> the email, I want to propose an alternative solution.

-------- Original Message --------
Subject: Re: turbine vs. struts
Date: Thu, 05 Jul 2001 18:30:49 -0700
From: Jon Stevens <>
To: Turbine-user <>

on 7/5/01 6:26 PM, "John McNally" <> wrote:

> So maybe I am wrong and there
> aren't that many people who prefer "mapping spec" approach.
> john mcnally

Personally, I don't see the point and I think it is a bad design. It is
single point of failure...

If you screw up your mapping .xml file on your live site, your entire
site is potentially broken. If you have 100 developers working on a site,
making each developer edit a single file in order to define things like
Actions and URI mapping is a terrible idea.

I think that part of Struts is a terrible design idea.


--------- End of Message ---------

Please, allow me to write something about this.

Jon is right in saying that it is a terrible design idea to force 100
developers working on a site to update a map each and every time
they have to do something.

This is why we did not force the sitemap to be one and only one file:
different parts of the site can be mapped in different files.

The idea of having a map is nothing different from Apache's .htaccess
files that started as a way to indicate access and later allowed you to
do "everything" from there. We simply redesigned the concept together
with the pipeline components approach that Cocoon2 has.

> What is right with the Sitemap
> ------------------------------

> Stefano envisioned a way to manage the URL space orthogonally to
> the filesystem. Before Cocoon, people simply expected the URL space 
> to match the filesystem.

*this* is the terrible design idea and we have to blame the first web
servers for this and also all the lazy butts that didn't want to update
such a mapping file every time they added a file.

Result: no control on the URI space. Loose contract. Broken links all
over the place and URIs that are pieces of crap.

But I'm sure there is a bunch of people (Jon being one of them)
who greatly prefer being lazy first and doing painful rewriting later,
rather than thinking more at first and avoiding stupid mistakes later.

It's a matter of life philosophy, I guess.

> Cocoon has several pieces to a generated result, and therefore this 
> simple approach really won't work.  The Sitemap enforces the contract 
> of URLs by allowing a filesystem to be reorganized independently of
> the URL space.  This is a good thing.

I'm more and more convinced of this myself the more I research the
topic (you'll read why in my thesis next week).

> What is wrong with the Sitemap
> ------------------------------

> I have already voiced the opinion that the sitemap mixes too many 
> concerns (component type declarations, etc.).  

Here I totally agree.

SoC is a pretty hard thing to do, especially when you are pioneering new

> There is also a problem with it being a single point of failure.  

Here, I disagree.

You are calling a contract a "single point of failure". Is it? Of
course, it must be. A solid contract allows two separate entities to
work together.

Would you consider the Servlet API a point of failure because, if
changed with no back compatibility, all the servlets written will be

It's obvious: a protocol, a spec, a map are all contracts and must be
changed wisely.

The serious issue here (besides implementation problems I already
explained) is who is in control of the sitemap: just like you wouldn't
allow servlet users to change the servlet API as they need, the sitemap
should be handled by a special team which is responsible to design the
site and manage it.

The three original concern islands (content, logic, style) become four

           /    |    \
      content logic style

and there must be no contracts (so no direct interaction) between the
three underlying groups, but all should pass thru the design area that
is in charge of controlling and managing the contracts (thus the maps).

> Just look at all the messages on Cocoon Users with the 
> "The sitemap handler's sitemap is not available" errors.  
> If the sitemap does not compile correctly, the whole site is dead.

This is an implementation issue. With interpretation, you would be able
to isolate mistakes and avoid performing a single URI mapping, perhaps,
but without killing the entire site.

Compilation is the point of failure here, not the sitemap design.
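
A minimal sketch of that distinction, assuming an interpreted map that evaluates each entry only when a request hits it. The class and method names are hypothetical and are not Cocoon APIs; the point is only that a broken entry fails its own URIs while the rest of the site keeps answering.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration; none of these names are Cocoon APIs. */
final class InterpretedMapSketch {

    /** Suffix -> handler. A handler may be broken and throw at request time. */
    private final Map<String, Function<String, String>> entries = new LinkedHashMap<>();

    void add(String suffix, Function<String, String> handler) {
        entries.put(suffix, handler);
    }

    /** Entries are evaluated lazily, so a broken entry only fails the requests that hit it. */
    String handle(String uri) {
        for (Map.Entry<String, Function<String, String>> e : entries.entrySet()) {
            if (uri.endsWith(e.getKey())) {
                try {
                    return e.getValue().apply(uri);
                } catch (RuntimeException ex) {
                    // The mistake is isolated to this one mapping.
                    return "500 mapping for '" + e.getKey() + "' is broken";
                }
            }
        }
        return "404 no mapping";
    }
}
```

With a compiled map, by contrast, the equivalent of the broken handler would stop the whole map from loading.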

> Also there is little security enforced, and the ability to extend 
> the file mapping outside the Servlet's Context.  These are bad things.

Careful, you are mistaking design limitations for design faults.

Jon is against mapping. It reminds me of my early choices that led to
the Cocoon1 reactor pattern and PI chains. Do you want to go back to that
spaghetti mess?

Well, this is where Turbine is heading.

*this* is a terrible design mistake and we know that from experience.

> It should _never_ be Cocoon's responsibility to mimic all the things 
> that Apache HTTPD can do.

Sure, but sometimes you can't avoid that, especially if Apache's
architecture is not even close to being powerful enough to support our
technological requirements (think of SAX pipelining between Apache
modules, it will take decades to happen).

> Allowing any kind of access outside of the Servlet spec approved areas
> (the context directory and the repository) is a violation of security 
> constraints and portability requirements in the servlet spec.

I agree, this is why we need a CMS, unfortunately none is powerful
enough for our needs.

> The sitemap is the most complex piece in the entire Cocoon system, 
> and as a result, it is difficult for new users to comprehend it.  

True, but when they do, they understand the entire thing, unlike
previously with Cocoon1 which was "magic art" all the way to the metal,
even for expert users.

> I have had three developers try to use Cocoon, and they look at 
> the sitemap and freeze.

Graphic artists look at XSLT and freeze.

Content writers look at XML and freeze.

Java programmers look at Perl code and freeze.

Emacs users look at VI and freeze.

See the pattern? It's not the technology that is wrong, it's the
intrinsic difficulty in twisting mindsets. 

Moreover, programmers used to procedural programming have a hard time
understanding declarative languages, and a sitemap is declarative.

> They spend too much time trying to understand the Sitemap, 
> and not enough time trying to solve problems.  
> In a development environment, this is not acceptable. 

You blame the sitemap for this. I would rather blame your environment's

If you people wanted to take your existing organization, place Cocoon in
the middle, and see all the problems magically disappear, well, sorry
but I still have to see a software architecture that allows this.

> It is very frustrating because any time I tell them "I have set up 
> the Sitemap for you, ignore it for now", I find that they are 
> still obsessing with it.  

Yes, I experienced this myself. But this "obsession" was created by the
lack of confidence in the work organization, lack of "concern coverage".
A sort of mismanagement that is perceived as bad by people who feel they
lack solidity in their contracts with their environment.

> The Sitemap as it stands is *too* powerful, and my developers 
> are tempted to try to use it to solve their problems.

Correct. Just like they'd rather write code into markup even if this
makes it a pain in the butt to manage later on.

Programmers are lazy butts. We all are. But we know from experience (we
wouldn't be using Java after all, nor be Avalon lovers, in fact) that
thinking first saves you time later.

Developers should *NOT* be granted access to the sitemap.


Neither should graphic artists, content writers, accounting managers, or
system administrators. The sitemap is in control of who designed the

> Lastly, in practice, there are a few actual pipelines 
> (generator/transformer/serializer) for each site.

I had the same experiences.

> In fact, I have one pipeline for _all_ my html code in my webapps.
> The things that differ are the Actions used in conjunction with it, or the 
> type of generator I use.  

Yes, this will be very common.

> Another side effect with the Sitemap is the existence of Readers.

Agreed. Readers are redundant. If Cocoon were implemented as an Apache
module (thus with a native connection to the web server) we wouldn't
need them.

But this is an implementation detail.

> It is my belief that anything simply read from a filesystem should 
> be handled by HTTP daemons like Apache, TUX, or whatever you use.  
> They are better optimized for it, and it reduces the load on the JVM.  

Agreed. In fact, as soon as Apache 2.0 is final (and a JVM can be placed
inside the server process, being multithreaded) I plan to consider
making a more Apache-specific interface to Apache's internal machinery
to improve performance.

But again, this is an implementation detail and doesn't impact the
design of the sitemap.

> We still need Readers for resources that cannot be reached via the 
> filesystem (i.e. the DatabaseReader).

Yes, but again, Apache is better suited for this as well.

> What should we do?
> ------------------

> We should pursue Stefano's FlowMap idea, as well as use more
> formalized definitions of a pipeline.  


> For the sake of our discussion, a pipeline will be considered a 
> generator, a list of transformers, and a serializer.  We will 
> ignore resources, views, and readers for the time being.  
> In practice, there are fewer pipelines than URLs much like there
> are fewer stylesheets than XML sources.  
> We need to define what they are, and how to map URLs to the 
> pipeline.  I already hear the chorus of people saying, 
> "Isn't that the sitemap?".  Hear me out, there is a much simpler 
> way of declaring these things.  It also leverages some approaches 
> that Avalon's Component Manager allows and aren't used in Cocoon.  
> Check out the following syntax:

<pipelines default="file2html">
  <pipeline id="file2html">
    <generator type="file" source="${source}.xml"/>
    <transformer type="xslt" source="document2site.xsl"/>
    <transformer type="xslt" source="site-${theme}.xsl"/>
    <serialize type="html"/>
  </pipeline>
  <pipeline id="xsp2html" extends="file2html">
    <generator type="serverpages" source="${source}.xml"/>
  </pipeline>
  <!-- ... continued ... -->
</pipelines>
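
One way to read the "extends" attribute above is that a child pipeline keeps whatever it does not redefine. The following is a minimal sketch of that reading, not the proposal's actual semantics: the Def class, its fields, and the rule that an empty transformer list means "inherit" are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of pipeline inheritance; these are not Cocoon classes. */
final class PipelineExtendsSketch {

    /** One pipeline entry. A null or empty field means "take it from the parent". */
    static final class Def {
        final String parent;
        final String generator;
        final List<String> transformers;
        final String serializer;

        Def(String parent, String generator, List<String> transformers, String serializer) {
            this.parent = parent;
            this.generator = generator;
            this.transformers = transformers;
            this.serializer = serializer;
        }
    }

    /** Flattens a definition by filling its gaps from the chain of parents. */
    static Def resolve(Map<String, Def> defs, String id) {
        Def d = defs.get(id);
        if (d == null) {
            throw new IllegalArgumentException("unknown pipeline: " + id);
        }
        if (d.parent == null) {
            return d;
        }
        Def p = resolve(defs, d.parent); // assumes there are no cyclic "extends" chains
        return new Def(null,
                d.generator != null ? d.generator : p.generator,
                d.transformers.isEmpty() ? p.transformers : d.transformers,
                d.serializer != null ? d.serializer : p.serializer);
    }

    public static void main(String[] args) {
        Map<String, Def> defs = new HashMap<>();
        defs.put("file2html", new Def(null, "file",
                Arrays.asList("document2site.xsl", "site-${theme}.xsl"), "html"));
        defs.put("xsp2html", new Def("file2html", "serverpages",
                Collections.<String>emptyList(), null));
        Def flat = resolve(defs, "xsp2html");
        System.out.println(flat.generator + " " + flat.transformers + " " + flat.serializer);
    }
}
```

Under these assumptions "xsp2html" ends up with the serverpages generator and the two stylesheets and serializer inherited from "file2html".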

> What is so special about this?  Aside from now having a list of 
> pipelines that we can use for flow maps and url maps, we have the 
> ${variable} construct.  So far this is not revolutionary.  
> What is new is the introduction of a FilteredContext that extends 
> Avalon's Context object.  This FilteredContext will have the 
> following methods:

interface FilteredContext extends Context {
    /**
     * Add a filter to the Context object
     */
    void addFilter(PipelineFilter filter);

    /**
     * Sets the Object Model that the filters can use.
     */
    void setObjectModel(Map objectModel);
}

> The Pipelines will extend the Recontextualizable interface, and for each
> request, they are fed a Context object that corresponds to a Flow map
> or URL map.  When the pipeline is executed, the "source" parameter of
> the SitemapComponents is populated from the FilteredContext.
> The code would look like this:

generator.setup(resolver, objectModel, context.get("${source}.xml"),

> The FilteredContext uses the internal filters to translate the source
> parameter into the actual filename.  Filters would be defined 
> in this manner:

<filters default="url-match">
  <filter id="url-match" defines="source"/>
  <filter id="parameter" defines="theme"/>
</filters>

  <mount prefix="process/" flowmap="context://process/flowmap.xmap"/>
  <alias suffix=".html" pipeline="file2html">
    <apply-filter name="url-match">
      <parameter name="doc-root" value="context://docs"/>
    </apply-filter>
    <apply-filter name="parameter">
      <parameter name="source" value="session"/>
      <parameter name="type" value="attribute"/>
      <parameter name="name" value="theme"/>
    </apply-filter>
  </alias>

<flow-map protected="true" permit-roles="admin,user,manager">
  <resource-pipeline suffix=".html" pipeline="xsp2html">
    <apply-filter name="url-match">
      <parameter name="doc-root" value="context://process"/>
    </apply-filter>
    <apply-filter name="parameter">
      <parameter name="source" value="constant"/>
      <parameter name="name" value="theme"/>
      <parameter name="value" value="default"/>
    </apply-filter>
  </resource-pipeline>

  <resource id="header" handler="process-header"/>
  <resource id="line-item" handler="process-lineitem"/>
  <resource id="confirmation" handler="process-confirm"/>
  <resource id="no-permission" handler="forward"/>

  <flow start="header" access-denied="no-permission">
    <entry resource="header" next="line-item"/>
    <entry resource="line-item">
      <choice parameter="destination" default="end">
        <value="home" next="header"/>
        <value="next" next="line-item"/>
        <value="end" next="confirmation"/>
      </choice>
    </entry>
    <entry resource="confirmation" exit="../index.html"/>
    <entry resource="no-permission" exit="../index.html"/>
  </flow>
</flow-map>

> Now, let's talk about how this all works together.  We have a 
> default URL-MAP that handles URI mapping, and takes care of 
> mounting the flow maps.  The filters and pipelines are simply 
> resources that are used in the map files--they can and should
> be contained in separate files.  

I like this concept of separating pipeline definitions into different
files. It sounds like a better componentization approach and for sure
improves SoC in the design area as well, since pipeline componentization
is more programming oriented than creating a flow.

Hmmmm, sounds very promising.

> The pipeline is chosen by simple suffix matching, and pipelines can 
> extend other pipelines.  The important thing to notice is that
> the Filters take care of the magic of pulling information from the 
> objectModel, and populating the variables in the source parameters.  
> This is easily comprehended, and very powerful.  This means that with 
> the proper planning, you can get away with very few pipelines.
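
The variable expansion described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the message names PipelineFilter but does not show its methods, so defines() and resolve() are invented here, and the class below stands alone instead of extending Avalon's Context.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for the proposed FilteredContext; not an Avalon or Cocoon class. */
final class FilteredContextSketch {

    /** Assumed filter shape: supplies the value of one named variable, or null. */
    interface PipelineFilter {
        String defines();
        String resolve(Map<String, Object> objectModel);
    }

    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    private final List<PipelineFilter> filters = new ArrayList<>();
    private Map<String, Object> objectModel = Collections.emptyMap();

    void addFilter(PipelineFilter filter) {
        filters.add(filter);
    }

    void setObjectModel(Map<String, Object> objectModel) {
        this.objectModel = objectModel;
    }

    /** Expands every ${name} in the template using the first filter that defines it. */
    String get(String template) {
        Matcher m = VAR.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = lookup(m.group(1));
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    private String lookup(String name) {
        for (PipelineFilter f : filters) {
            if (f.defines().equals(name)) {
                String v = f.resolve(objectModel);
                if (v != null) {
                    return v;
                }
            }
        }
        throw new IllegalStateException("no filter defines ${" + name + "}");
    }
}
```

A filter that defines "source" from the request URI and another that defines "theme" from a session attribute would then turn "${source}.xml" and "site-${theme}.xsl" into concrete file names.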

> The URL Map first checks to see if the URL matches the mounted flowmap.
> If not, it falls through to the alias for ".html" URLs.  Notice the 
> name "alias", as it properly reflects what is going on here.  Also note 
> that there is a default pipeline.  In the absence of any URL-Maps or
> Flow-Maps, the request will follow that pipeline's instructions.  
> What about the variables?  That is something that requires some thought.
> We could declare the filters in the pipelines, as they would now work 
> automatically. We also could provide reasonable defaults.
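
The lookup order described above (mounted flow maps first, then suffix aliases, then the default pipeline) can be sketched as below. The class, its methods, and the "flowmap:" / "pipeline:" result strings are hypothetical and exist only to make the order visible.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical URL map; illustrates the matching order only. */
final class UrlMapSketch {

    private final Map<String, String> mounts = new LinkedHashMap<>();  // prefix -> flow map location
    private final Map<String, String> aliases = new LinkedHashMap<>(); // suffix -> pipeline id
    private final String defaultPipeline;

    UrlMapSketch(String defaultPipeline) {
        this.defaultPipeline = defaultPipeline;
    }

    void mount(String prefix, String flowmap) {
        mounts.put(prefix, flowmap);
    }

    void alias(String suffix, String pipeline) {
        aliases.put(suffix, pipeline);
    }

    /** Mounted flow maps win, then suffix aliases, then the default pipeline. */
    String resolve(String uri) {
        for (Map.Entry<String, String> m : mounts.entrySet()) {
            if (uri.startsWith(m.getKey())) {
                return "flowmap:" + m.getValue();
            }
        }
        for (Map.Entry<String, String> a : aliases.entrySet()) {
            if (uri.endsWith(a.getKey())) {
                return "pipeline:" + a.getValue();
            }
        }
        return "pipeline:" + defaultPipeline;
    }
}
```

Because the mount is checked first, "process/order.html" goes to the flow map even though it also ends in ".html".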

> The Flow Map is a bit different.  It configures the pipeline and 
> filters for all the resources--this is a development speed savings.
>  After all, all the resources in a form are going to remain the same. 
> You will also notice the attributes of the Flowmap.
> Many forms are only allowed to people with the proper roles.  
> That is why the "protected" attribute and the "permit-roles" are 
> present.  When a flow map is protected, we check the Request 
> "isUserInRole" method to find out if a user can access the resource.  
> After the resources are defined, we see the <flow/> entry.  
> The "start" attribute determines where normal flow starts, and the
> "access-denied" attribute determines the resource used when a user
> is not in the proper role.  Lastly, we have the
> entries that determine where flow moves.  There are three ways of 
> determining the next action in a flow:

* The "next" attribute
* The "exit" attribute
* The "choice" element

What about "back"?

> The "next" and "exit" attributes function similarly, as they specify a 
> static destination. They differ in that the "next" attribute specifies 
> a resource and the "exit" attribute specifies a URL.

Why so? (just curious)

> The "choice" element allows you to specify a Request parameter to 
> inspect for a selection of destinations.  You must provide a default 
> value so that the flow is never broken.  The default is chosen if the 
> parameter specified does not exist or does not contain any of the
> specified values.
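
The fallback rule for "choice" can be sketched in a few lines. This is an illustration under assumptions, not the proposed implementation: the choices are modelled as a plain map from parameter value to next resource, and the method name is hypothetical.

```java
import java.util.Map;

/** Hypothetical helper showing how a "choice" could pick the next resource. */
final class FlowChoiceSketch {

    /**
     * Returns the destination for the request parameter value, falling back to
     * the default when the parameter is missing or not one of the listed values.
     */
    static String choose(String paramValue, Map<String, String> choices, String defaultValue) {
        String key = (paramValue != null && choices.containsKey(paramValue))
                ? paramValue
                : defaultValue;
        String next = choices.get(key);
        if (next == null) {
            // The default itself must be a listed value, otherwise the flow would break.
            throw new IllegalStateException("default '" + defaultValue + "' has no destination");
        }
        return next;
    }
}
```

With the values from the flow map above and default "end", a missing or unexpected "destination" parameter leads to "confirmation".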

Great ideas, Berin. Great things to work on. I'll digest them a little,
play with it and try to come up with some comments in the near future.

> ------------

> This whole solution can be put together without compiling the resources.  
> In fact, I would much rather it be done that way.  By depending on a 
> dynamically compiled resource we increase the risk that our site will 
> fail.

Totally agreed.

Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<>                             Friedrich Nietzsche
