cocoon-dev mailing list archives

From Stefano Mazzocchi
Subject Re: Squaring the sitemap circle...
Date Fri, 09 Jun 2000 00:04:40 GMT
Jonathan Stimmel wrote:
> On Fri, May 26, 2000 at 05:38:06PM +0200, Stefano Mazzocchi wrote:
> > An improved version of the sitemap has just been placed on the CVS
> > cocoon2 branch. I will refer to that one from now on.
> Interesting... I like. Comments on specific pieces follow (and bear
> in mind that I view cocoon from a live-server standpoint; I have
> less interest in the offline mode, so my views are admittedly biased).
> > 7) uri mapping should be powerful enough to allow every possible
> > mapping need
> Nit-pick: I think this wording is dangerous. I think it's more
> important to handle "most common" needs well, get it out the door,
> and learn from how people are actually using it. (Hmmmm... after
> a little more thought, I think I misinterpreted your intentions; by
> allowing people to create their own matchers, this requirement
> has already been met!)

Right on. The above wording is explicitly broad because we have a
powerful language and a pretty good (IMO) component model to extend our
mapping capabilities.
> > The main matching type, URI matching, is hardcoded into the sitemap
> > schema because, by far, [it is] the most used
> URI matching is almost certainly going to be part of every pipeline,
> but I'm not convinced it should be treated as a special case. The models
> I've seen thus far force the use of the URI as the primary matching
> agent. What if I want to organise my sitemap by output format (e.g.
> HTML vs. WAP)?

Good point.
> > Mount points allow sitemaps to be cascaded and site management workload
> > to be parallelized.
> > <mount
> >    uri="^/xerces-[j|c|p]/(.*)$"
> >    src:cvs="$1/xdocs/$2"
> >    rule="regexp"/>
> > <mount
> >    uri="faq/*"
> >    src:jar="jar:./apps/faq-o-matic.cocoon#*"/>
> I'm not sure I understand the purpose of the mount tag; to me it just
> looks like another version of the <process>. From the description, I would
> expect it to be more like mounting a filesystem (grafting a sitemap
> onto our URI tree), more like:
>   <mount uri="/faq" src="/faq/faq.xconf"/>

This works if the sitemap is a file, but what if it's contained in a
subdirectory of a jar file?
> (BTW, there's a typo in your regex; "[j|c|p]" should be either "([jcp])"
> or "(j|c|p)". Not that it really matters in a nonfunctional prototype... =)

See why I don't like regexps? :-) Thanks for pointing it out, though.
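
For anyone who wants to see what the typo actually does, here is a
minimal sketch in Python (only because its regex engine is handy; it
is not how the sitemap would evaluate anything). The sample URI
"/xerces-j/faq.xml" is made up for illustration. "[j|c|p]" is a
character class that also accepts a literal "|" and captures nothing,
so the "$2" in the src attribute has nothing to refer to.

```python
import re

# Pattern as written in the quoted <mount>: a character class, no group.
typo = re.compile(r"^/xerces-[j|c|p]/(.*)$")

# Corrected form suggested by Jonathan: a capturing group.
fixed = re.compile(r"^/xerces-(j|c|p)/(.*)$")

uri = "/xerces-j/faq.xml"  # hypothetical request URI

typo_groups = typo.match(uri).groups()    # only one group is captured
fixed_groups = fixed.match(uri).groups()  # both $1 and $2 are available

# The character class also lets a literal "|" through.
pipe_matches = typo.match("/xerces-|/faq.xml") is not None

# Equivalent of src="$1/xdocs/$2" using the corrected pattern.
expanded = fixed.match(uri).expand(r"\1/xdocs/\2")
```

With the corrected pattern the substitution has both pieces it needs;
with the original one there is only a single captured group.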
> > <redirect to="dist/cocoon/*"/>
> Why such a specific tag? Perhaps something more generic is in order
> (action? response?). A related action would be setting the return
> code (not sure if this is a realistic example, just a first reaction).

The serializer has an attribute for return code.

I wrote this specific tag because <read> and <redirect> are the only
tags that don't require pipeline processing. But if you can think of
more, I'm happy to discuss this further.

> You know, in some ways this is kind of like a specialized serializer;
> both redirect and a true serializer send something to the client. Of
> course, the redirect doesn't convert an XML tree into a stream, so
> calling it a serializer would be a bit of a misnomer. I wonder if
> perhaps we could extend the concept of a Serializer to include
> this type of functionality. (I'm not sure I necessarily advocate
> generalizing serializer, it's just a thought...)

Since those actions are handled without entering the XML processing
pipeline, I don't think we need to. Also note that Cocoon doesn't need
the functionality Apache has here, because we are called only to do our
own work: XML or plain resources. Period.

Everything else is handled by the web server or servlet engine, which
keeps our concerns separate.

To be honest, we probably don't need <redirect>...

> ** Proposed modifications
> Currently, the sitemap is viewed as a collection of parts (the process
> tags). What if we viewed it as a tree (after all, isn't that what
> URIs and XML are?). 

Gee, the same thing happened over in the James sitemap... You are
transforming a declarative approach into a procedural one. That might be
easier for you (a programmer) but could be harder for non-programmers.

> The root of the tree would be a matcher (normally
> matching against URI, but other organizations would be possible - e.g.,
> my HTML vs. WAP example). Here's a quick example (the following would
> follow the Component declarations):
> <matcher type="uri">
>   <!-- Simple rule to cover random xml files -->
>   <match value="/*.xml">

very interesting idea...

>     <generate type="file" src="/*.xml"/>
>     <filter type="xslt">
>       <param name="stylesheet" value="/*.xsl"/>
>     </filter>
>   </match>
>   <!-- Here's our company directory, which our salespeople need
>        to access via their cell phones -->
>   <match value="/directory">
>     <generate type="sql" src="/directory.xml"/>
>     <matcher type="browser">
>       <default>
>         <filter type="xslt">
>           <param name="stylesheet" value="/directory-html.xsl"/>
>         </filter>
>         <serialize type="html"/>
>       </default>
>       <match value="Nokia">
>         <filter type="xslt">
>           <param name="stylesheet" value="/directory-wap.xsl"/>
>         </filter>
>         <serialize type="wap"/>
>       </match>
>     </matcher>
>   </match>
>   <match value="/ourStuff/*">
>     <!-- Who was the idiot who put our product info under this
>          directory?! We should redirect them to the newly created
>          /products/ directory -->
>     <response>
>       <header name="Location" value="/products/*">

Here you need to know that a redirect is performed by adding a
"Location" header to the HTTP response. I'm pretty sure my girlfriend
doesn't know that :) (nor does she want to learn it)

This is way too low level.
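
To make the difference in level concrete, here is a rough sketch
(Python, purely illustrative). The redirect() helper and the dictionary
layout are hypothetical, not anything in Cocoon; the point is only that
one form states the intent and the other spells out the HTTP mechanics.

```python
# Hypothetical helper, not part of Cocoon: it hides the HTTP detail
# (a 3xx status plus a "Location" header) behind a single intent.
def redirect(to, status=302):
    return {"status": status, "headers": {"Location": to}}

# What the sitemap author has to know and write in the low-level form.
by_hand = {"status": 302, "headers": {"Location": "/products/"}}

# What the author writes in the high-level form.
response = redirect("/products/")
```

Both produce the same response; only the second asks nothing of the
person writing the sitemap.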

>     </response>
>   </match>
>   <match value="/products">
>     <sitemap src="/products/sitemap.xconf"/>
>   </match>
>   <default>
>     <generate type="file" src="/errors/notfound.xml"/>
>     <filter type="xslt">
>       <param name="stylesheet" value="/errors/notfound.xsl"/>
>     </filter>
>     <response>
>       <header name="Status" value="404 Not Found">

Same as above.

>       <serialize type="html"/>
>     </response>
>   </default>
> </matcher>
> I can see some problems (what if you want an actual "*" in a
> response header?), but I think it gets my intentions across.

It does. It gives a valid alternative to the proposed schema.
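
The "*" problem Jonathan mentions can be sketched as follows. This is
a guess at the wildcard semantics implied by the example (one "*" in
the match value, every "*" in the target replaced by what it matched);
the function names and the sample URI are hypothetical.

```python
import re

def match_wildcard(pattern, uri):
    """Return the text matched by the single '*' in pattern, or None."""
    head, star, tail = pattern.partition("*")
    if not star:
        # No wildcard: only an exact match succeeds, capturing nothing.
        return "" if pattern == uri else None
    m = re.fullmatch(re.escape(head) + "(.*)" + re.escape(tail), uri)
    return m.group(1) if m else None

def substitute(template, captured):
    """Replace every '*' in template with the captured text."""
    return template.replace("*", captured)

captured = match_wildcard("/ourStuff/*", "/ourStuff/widgets.html")
location = substitute("/products/*", captured)

# The ambiguity: a value that needs a literal '*' gets rewritten too.
mangled = substitute("*/*", "x")
```

Under these assumptions there is no way to write a literal "*" in a
value, so some escaping convention would be needed.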

> Please comment =)

My first comment (I want others to weigh in on this before I influence
their judgement too much) is a quiz:

 The above example has a big mistake. Where is it?

Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
                               Friedrich Nietzsche
 Missed us in Orlando? Make it up with ApacheCON Europe in London!
------------------------- http://ApacheCon.Com ---------------------
