cocoon-dev mailing list archives

From giacomo <>
Subject Re: [RT] Alternatives to sitemap
Date Wed, 11 Jul 2001 20:13:24 GMT
On Wed, 11 Jul 2001, giacomo wrote:

> I have been thinking about the whole sitemap approach in Cocoon 2.
> It is the point that has the biggest learning curve, and it is a
> single point of failure.  The following email on the Turbine mail
> list prompted me to voice this opinion now instead of later.  After
> the email, I want to propose an alternative solution.

You find single points of failure in almost every application that needs
to be configured. Look at the Apache HTTPD and its httpd.conf file. Did
you ever mess it up and then try to restart it?

This is one reason why we have sub-sitemaps. Build one which only mounts
the others and construct your URI space in a hierarchical way. Of course
the root sitemap is the most sensitive one, but if you can manage it that
way you'll gain a very stable design of your URI space with independent
segments, right?

> -------- Original Message --------
> Subject: Re: turbine vs. struts
> Date: Thu, 05 Jul 2001 18:30:49 -0700
> From: Jon Stevens <>
> Reply-To:
> To: Turbine-user <>
> on 7/5/01 6:26 PM, "John McNally" <> wrote:
> > So maybe I am wrong and there
> > aren't that many people who prefer "mapping spec" approach.
> >
> > john mcnally
> Personally, I don't see the point and I think it is a bad design. It is
> a single point of failure...
> If you screw up your mapping .xml file on your live site, your entire
> site is potentially broken. If you have 100 developers working on a
> site, then making each developer edit a single file in order to define
> things like Actions and URI mapping is a terrible idea.

Cocoon 2 does not follow this pattern. For a development team you need to
have ONE person responsible for a sitemap and thus for the design of the
app/site they are developing. Integration into a production environment
should go the usual staged way.

> I think that part of Struts is a terrible design idea.
> -jon
> --------- End of Message ---------
>
> What is right with the Sitemap
> ------------------------------
> Stefano envisioned a way to manage the URL space orthogonally to the
> filesystem.  Before Cocoon, people simply expected the URL space to
> match the filesystem.  Cocoon has several pieces to a generated result,
> and therefore this simple approach really won't work.  The Sitemap
> enforces the contract of URLs by allowing a filesystem to be
> reorganized independently of the URL space.  This is a good thing.
>
> What is wrong with the Sitemap
> ------------------------------
> I have already voiced the opinion that the sitemap mixes too many
> concerns (component type declarations, etc.).

Well, I think I've stated this before. I'd like to see the component
definitions move from sitemap.xmap into cocoon.xconf sooner rather than
later. This will shrink sitemap.xsl by quite a bit of code.

> There is also a problem with it being a single point of failure.

Again, this can easily be reduced by sub-sitemaps. But there is almost
always a single point of failure in every system.

> Just look at all the messages on Cocoon Users with the "The sitemap
> handler's sitemap is not available" errors.  If the sitemap does not
> compile correctly, the whole site is dead.

No. Only the part of the URI space the sitemap in question is mapped to. I
have to add here that the compilation of the sitemap was an experiment
I made long, long ago. Suddenly it was THAT piece of C2. I would really
like to see someone build an interpreted version to see the differences.

> Also there is little security enforced, and it is possible to extend
> the file mapping outside the Servlet's Context.  These are bad things.
> It should _never_ be Cocoon's responsibility to mimic all the things
> that Apache HTTPD can do.  Allowing any kind of access outside of the
> Servlet spec approved areas (the context directory and the repository)
> is a violation of security constraints and portability requirements in
> the servlet spec.

Also, if you read the first draft of the sitemap (xdocs/draft/...)
you'll see why this is possible. The initial design was that you can
define your context at will for a particular (sub)sitemap URI space
combination. That it is possible to serve a file outside the context
(e.g. with the FileGenerator) is simply laziness of the implementation.
This has nothing to do with the servlet spec or whatever. And keep in mind
that Cocoon 2 is not a Servlet (it is only embedded into a servlet
environment as one of the major implemented environments).

> The sitemap is the most complex piece in the entire Cocoon system, and
> as a result, it is difficult for new users to comprehend it.  I have had
> three developers try to use Cocoon, and they look at the sitemap and
> freeze.  They spend too much time trying to understand the Sitemap, and
> not enough time trying to solve problems.  In a development environment,
> this is not acceptable.  It is very frustrating because any time I tell
> them "I have set up the Sitemap for you, ignore it for now", I find that
> they are still obsessing over it.  The Sitemap as it stands is *too*
> powerful, and my developers are tempted to try to use it to solve their
> problems.

I don't know the environment/organisation you use in your development
cycle, but what you state is reality for most teams I've worked with.
If you build a team to develop an application, you almost always need
a db guru, a design guru, a system administrator, a java guru and also
averagely skilled people to do the basic work, so why not a sitemap guru
(SoC?). Of course, the smaller the project/team is, the more skills will
be unified in a single person.

> Lastly, in practice, there are a few actual pipelines
> (generator/transformer/serializer) for each site.  In fact, I have one
> pipeline for _all_ my html code in my webapps.  The things that differ
> are the Actions used in conjunction with it, or the type of generator I
> use.

Ah, so it isn't that hard to write the sitemap for it, is it? ;)

> Another side effect of the Sitemap is the existence of Readers.  It is
> my belief that anything simply read from a filesystem should be handled
> by HTTP daemons like Apache, TUX, or whatever you use.  They are better
> optimized for it, and it reduces the load on the JVM.  We still need
> Readers for resources that cannot be reached via the filesystem (e.g.
> the DatabaseReader).

Nothing prevents you from doing so. But in the development phase it is way
easier to have only Tomcat, which is easy to set up on almost every
platform, instead of additionally setting up an Apache httpd on the
developer's machine. At deployment time into a production environment you
can always grab the URLs which should be handled by a traditional httpd
instead of passing them to the servlet engine and Cocoon.

> What should we do?
> ------------------
> We should pursue Stefano's FlowMap idea, as well as use more formalized
> definitions of a pipeline.  For the sake of our discussion, a pipeline
> will be considered a generator, a list of transformers, and a
> serializer.  We will ignore resources, views, and readers for the time
> being.  In practice, there are fewer pipelines than URLs, much like
> there are fewer stylesheets than XML sources.  We need to define what
> they are, and how to map URLs to the pipeline.  I already hear the
> chorus of people saying, "Isn't that the sitemap?".  Hear me out, there
> is a much simpler way of declaring these things.  It also leverages some
> approaches that Avalon's Component Manager allows and that aren't used
> in Cocoon.  Check out the following syntax:
> <pipelines default="file2html">
>   <pipeline id="file2html">
>     <generator type="file" source="${source}.xml"/>
>     <transformer type="xslt" source="document2site.xsl"/>
>     <transformer type="xslt" source="site-${theme}.xsl"/>
>     <serializer type="html"/>
>   </pipeline>
>   <pipeline id="xsp2html" extends="file2html">
>     <generator type="serverpages" source="${source}.xml"/>
>   </pipeline>
>   <!-- ... continued ... -->
> </pipelines>
> What is so special about this?  Aside from now having a list of
> pipelines that we can use for flow maps and url maps, we have the
> ${variable} construct.  So far this is not revolutionary.  What is new
> is the introduction of a FilteredContext that extends Avalon's Context
> object.  This FilteredContext will have the following methods:
> interface FilteredContext extends Context {
>     /**
>      * Add a filter to the Context object
>      */
>     void addFilter(PipelineFilter filter);
>     /**
>      * Sets the Object Model that the filters can use.
>      */
>     void setObjectModel(Map objectModel);
> }
> The Pipelines will extend the Recontextualizable interface, and for each
> request, they are fed a Context object that corresponds to a Flow map or
> URL map.  When the pipeline is executed, the "source" parameter of the
> SitemapComponents is populated from the FilteredContext.  The code would
> look like this:
> generator.setup(resolver, objectModel, context.get("${source}.xml"),
>     parameters);
> The FilteredContext uses the internal filters to translate the source
> parameter into the actual filename.  Filters would be defined in this
> manner:
> <filters default="url-match">
>   <filter id="url-match" defines="source"/>
>   <filter id="parameter" defines="theme"/>
> </filters>
> <url-map>
>   <mount prefix="process/" flowmap="context://process/flowmap.xmap"/>
>   <alias suffix=".html" pipeline="file2html">
>     <apply-filter name="url-match">
>       <parameter name="doc-root" value="context://docs"/>
>     </apply-filter>
>     <apply-filter name="parameter">
>       <parameter name="source" value="session"/>
>       <parameter name="type" value="attribute"/>
>       <parameter name="name" value="theme"/>
>     </apply-filter>
>   </alias>
> </url-map>
> <flow-map protected="true" permit-roles="admin,user,manager">
>   <resource-pipeline suffix=".html" pipeline="xsp2html">
>     <apply-filter name="url-match">
>       <parameter name="doc-root" value="context://process"/>
>     </apply-filter>
>     <apply-filter name="parameter">
>       <parameter name="source" value="constant"/>
>       <parameter name="name" value="theme"/>
>       <parameter name="value" value="default"/>
>     </apply-filter>
>   </resource-pipeline>
>   <resource id="header" handler="process-header"/>
>   <resource id="line-item" handler="process-lineitem"/>
>   <resource id="confirmation" handler="process-confirm"/>
>   <resource id="no-permission" handler="forward"/>
>   <flow start="header" access-denied="no-permission">
>     <entry resource="header" next="line-item"/>
>     <entry resource="line-item">
>       <choice parameter="destination" default="end">
>         <value name="home" next="header"/>
>         <value name="next" next="line-item"/>
>         <value name="end" next="confirmation"/>
>       </choice>
>     </entry>
>     <entry resource="confirmation" exit="../index.html"/>
>     <entry resource="no-permission" exit="../index.html"/>
>   </flow>
> </flow-map>
> Now, let's talk about how this all works together.  We have a default
> URL-MAP that handles URI mapping, and takes care of mounting the flow
> maps.  The filters and pipelines are simply resources that are used in
> the map files--they can and should be contained in separate files.  The
> pipeline is chosen by simple suffix matching, and pipelines can extend
> other pipelines.  The important thing to notice is that the Filters take
> care of the magic of pulling information from the objectModel, and
> populating the variables in the source parameters.  This is easily
> comprehended, and very powerful.  This means that with the proper
> planning, you can get away with very few pipelines.
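
The ${variable} substitution described above can be illustrated with a
minimal Java sketch. It is not the proposed FilteredContext
implementation: it assumes the filters have already produced a plain map
of variable names to values, and the class name VariableExpansion and the
method expand are hypothetical names made up for this illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Cocoon or Avalon.
public class VariableExpansion {
    // Matches ${name} and captures the name.
    private static final Pattern VAR = Pattern.compile("\\$\\{([^}]+)\\}");

    /** Replaces each ${name} with its value; unknown names are left untouched. */
    public static String expand(String template, Map<String, String> values) {
        Matcher m = VAR.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = values.get(m.group(1));
            String replacement = (value != null) ? value : m.group(0);
            // quoteReplacement keeps '$' and '\' in values from being interpreted.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed filter output: "source" from the URL match, "theme" from the session.
        Map<String, String> values = new HashMap<String, String>();
        values.put("source", "docs/index");
        values.put("theme", "blue");
        System.out.println(expand("${source}.xml", values));     // docs/index.xml
        System.out.println(expand("site-${theme}.xsl", values)); // site-blue.xsl
    }
}
```

Leaving unknown variables untouched is a choice made for this sketch
only; the proposal does not say what should happen in that case.
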
> The URL Map first checks to see if the URL matches the mounted flowmap.
> If not, it falls through to the alias for ".html" URLs.  Notice the name
> "alias", as it properly reflects what is going on here.  Also note that
> there is a default pipeline.  In the absence of any URL-Maps or
> Flow-Maps, the request will follow that pipeline's instructions.  What
> about the variables?  That is something that requires some thought.  We
> could declare the filters in the pipelines, as they would now work
> automatically.  We also could provide reasonable defaults.
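
The lookup order described above (mounted flow map first, then suffix
alias, then the default pipeline) could be sketched in Java roughly as
follows. This is only an illustration under assumptions: the class
UrlMapSketch, its methods, and the "flowmap:"/"pipeline:" result strings
are invented here, and plain prefix/suffix string matching stands in for
whatever matching the real implementation would use.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the proposed <url-map/>, not an existing Cocoon class.
public class UrlMapSketch {
    private final Map<String, String> mounts = new LinkedHashMap<String, String>();  // prefix -> flowmap
    private final Map<String, String> aliases = new LinkedHashMap<String, String>(); // suffix -> pipeline id
    private final String defaultPipeline;

    public UrlMapSketch(String defaultPipeline) {
        this.defaultPipeline = defaultPipeline;
    }

    public void mount(String prefix, String flowmap) {
        mounts.put(prefix, flowmap);
    }

    public void alias(String suffix, String pipeline) {
        aliases.put(suffix, pipeline);
    }

    /** Mounted flow maps win, then suffix aliases, then the default pipeline. */
    public String resolve(String uri) {
        for (Map.Entry<String, String> e : mounts.entrySet()) {
            if (uri.startsWith(e.getKey())) {
                return "flowmap:" + e.getValue();
            }
        }
        for (Map.Entry<String, String> e : aliases.entrySet()) {
            if (uri.endsWith(e.getKey())) {
                return "pipeline:" + e.getValue();
            }
        }
        return "pipeline:" + defaultPipeline;
    }

    public static void main(String[] args) {
        UrlMapSketch map = new UrlMapSketch("file2html");
        map.mount("process/", "context://process/flowmap.xmap");
        map.alias(".html", "file2html");
        System.out.println(map.resolve("process/header.html")); // flowmap:context://process/flowmap.xmap
        System.out.println(map.resolve("docs/index.html"));     // pipeline:file2html
        System.out.println(map.resolve("docs/readme.txt"));     // pipeline:file2html (default)
    }
}
```
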
> The Flow Map is a bit different.  It configures the pipeline and filters
> for all the resources--this is a development speed savings.  After all,
> all the resources in a form are going to remain the same.  You will also
> notice the attributes of the Flowmap.  Many forms are only allowed to
> people with the proper roles.  That is why the "protected" attribute and
> the "permit-roles" attribute are present.  When a flow map is protected,
> we check the Request "isUserInRole" method to find out if a user can
> access the resource.  After the resources are defined, we see the
> <flow/> entry.  The "start" attribute determines where normal flow
> starts, and the "access-denied" attribute determines the resource used
> when a user is not in the proper role.  Lastly, we have the entries that
> determine where flow moves.  There are three ways of determining the
> next action in a flow:
> * The "next" attribute
> * The "exit" attribute
> * The "choice" element
> The "next" and "exit" attributes function similarly, as they specify a
> static destination.  They differ in that the "next" attribute specifies
> a resource and the "exit" attribute specifies a URL.
> The "choice" element allows you to specify a Request parameter to
> inspect for a selection of destinations.  You must provide a default
> value so that the flow is never broken.  The default is chosen if the
> parameter specified does not exist or does not contain any of the
> specified values.
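
The three ways of picking the next step ("next", "exit", "choice" with a
mandatory default) can be sketched in Java as below. This is a rough
illustration, not the proposed implementation: FlowSketch, Entry, the
factory methods and the "resource:"/"url:" result strings are all
hypothetical, and request parameters are modelled as a plain map rather
than a real Request object.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a <flow/> entry, not an existing Cocoon class.
public class FlowSketch {
    public static final class Entry {
        final String next;            // static resource destination, or null
        final String exit;            // static URL destination, or null
        final String choiceParameter; // request parameter to inspect, or null
        final String choiceDefault;   // value used when the parameter is missing or unknown
        final Map<String, String> choices; // parameter value -> resource

        private Entry(String next, String exit, String choiceParameter,
                      String choiceDefault, Map<String, String> choices) {
            this.next = next;
            this.exit = exit;
            this.choiceParameter = choiceParameter;
            this.choiceDefault = choiceDefault;
            this.choices = choices;
        }

        public static Entry toResource(String resource) {
            return new Entry(resource, null, null, null, null);
        }

        public static Entry toUrl(String url) {
            return new Entry(null, url, null, null, null);
        }

        public static Entry byChoice(String parameter, String defaultValue,
                                     Map<String, String> choices) {
            // The default must name one of the values so the flow is never broken.
            if (!choices.containsKey(defaultValue)) {
                throw new IllegalArgumentException("default must be one of the values");
            }
            return new Entry(null, null, parameter, defaultValue,
                             new HashMap<String, String>(choices));
        }
    }

    /** Returns "resource:<id>" or "url:<target>" for the step after this entry. */
    public static String nextStep(Entry entry, Map<String, String> requestParameters) {
        if (entry.next != null) {
            return "resource:" + entry.next;
        }
        if (entry.exit != null) {
            return "url:" + entry.exit;
        }
        String value = requestParameters.get(entry.choiceParameter);
        if (value == null || !entry.choices.containsKey(value)) {
            value = entry.choiceDefault; // missing or unknown value: fall back
        }
        return "resource:" + entry.choices.get(value);
    }

    public static void main(String[] args) {
        Map<String, String> choices = new HashMap<String, String>();
        choices.put("home", "header");
        choices.put("next", "line-item");
        choices.put("end", "confirmation");
        Entry lineItem = Entry.byChoice("destination", "end", choices);

        Map<String, String> request = new HashMap<String, String>();
        request.put("destination", "home");
        System.out.println(nextStep(lineItem, request));                       // resource:header
        System.out.println(nextStep(lineItem, new HashMap<String, String>())); // resource:confirmation
        System.out.println(nextStep(Entry.toUrl("../index.html"), request));   // url:../index.html
    }
}
```
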

I really like and understand the design you are explaining. And I will
admit it is better designed than the sitemap syntax we have today,
because today we know better what we'd like the sitemap to do for us.
This comes from the fact that after the initial design some more
components/constructs needed to be integrated, even though we spent
a lot of time designing it in the first place. These additions might not
fit into it well.

I would also like to see such a syntax implemented because I can see it is
much more complete than what we have today.

But, Berin, be honest. Do you think the learning curve is getting any
flatter with your syntax?

> ------------
> This whole solution can be put together without compiling the resources.
> In fact, I would
> much rather it be done that way.  By depending on a dynamically compiled
> resource we increase
> the risk that our site will fail.

This is an implementation detail which I think can be managed.

Thanks for your interesting RT and I really hope we can come up with a
better design of the sitemap syntax.

