cocoon-dev mailing list archives

From Jeff Turner <>
Subject Re: [RT] Link Rewriting
Date Tue, 29 Apr 2003 06:50:20 GMT
On Mon, Apr 28, 2003 at 03:16:34PM +0100, Jeremy Quinn wrote:
> Hi All
> Welcome to my first RT!
> Sorry to start on a negative note, but IMHO, the current scheme for 
> LinkRewriting just does not make sense for dynamic sites.
> What the current scheme allows is for authors to use a simple set of 
> declarators for defining links, which are then translated into actual 
> URLs, (and in the samples) mapped directly to filespace (etc.).

Yes.  <link href="site:foo">, where 'foo' is your declarator, and the
link is translated into the actual URL <a href="foo.html">.

So LinkRewriterTransformer does the mapping:

declarator -> *.html, *.pdf etc (URL)

However, that was only half of the original Linkmap vision :)  I think
you're asking for the other half, which is declarator -> SystemID

See the original linkmap RT:

The declarator -> SystemID mapping in that RT was declared in a site.xml file:

<site dir="./content/xdocs">
  <index file="index.xml"/>
  <dreams file="dreams.xml"/>
  <faq file="faq.xml">
    <how_can_I_help xpath="/faqs/faq/question[@id='how_can_I_help']"/>
    <building_own_website xpath="/faqs/faq/question[@id='own_website']"/>
  </faq>
</site>

That is functionally equivalent to the XML in your post quoted below.

In the linkmap RT, I envisioned a Source that accepts a 'declarator' as a
@src, and emits XML from that declarator's SystemID.  Your idea of using
input modules with dynamic inputs, {site:{1}}, is a nicer implementation
than what I had in mind.

> What I believe a LinkRewriting infrastructure should offer is rather 
> different.
> Cocoon's sitemap is excellent at disconnecting any direct relationship 
> between URL and resource ID/Path (SystemID). Allowing the 
> re-implementation of storage schema independently of URL contract.
> What I feel makes more sense is this:
> 	URL == permanent contract == authoring link != SystemID
> What this means is, authors should write links using a static absolute 
> URL, the same one as the public contract for that particular piece of 
> information.
> When that URL is accessed, it should be mapped to a SystemID, allowing 
> independent re-implementation of the storage layer.

So we have URLs everywhere, and in the sitemap we do a lookup on a
linkmap to determine the actual source?  This kind of thing:

<map:match pattern="*.html">
  <map:generate src="content/xdocs/{linkmap:{1}}"/>
  ...
</map:match>

> If this is to be handled by an input module accessing a linkmap (before 
> Generation, rather than during Transformation), but requires that 
> dynamic sitemap URL fragments can be passed to input modules (as they 
> cannot currently).
>  see: <>
> The transformation stage is then merely one of relativising author's 
> URLs.

But that isn't the only use of link rewriting :)  The original idea was
that <link href="site:index"> could be translated into:

 - <a href="index.html"> if linked to from a *.html file
 - <a href="index.pdf"> if linked to from a *.pdf file
 - <a href="#index"> if linked to from an aggregate HTML
 - <fo:basic-link internal-destination="index"> if linked to from an
   aggregate PDF
 - ....

This is the value of the declarator -> URL mapping.
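
A rough XSLT sketch of the first two cases above, purely for illustration.
This is not how LinkRewriterTransformer is implemented; the 'ext' parameter
is an assumption introduced here, standing in for "the format of the page
being rendered", and the aggregate cases are not handled:

```xml
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: extension of the output being produced,
       e.g. 'html' or 'pdf'. The pipeline would have to supply it. -->
  <xsl:param name="ext" select="'html'"/>

  <!-- <link href="site:index"> becomes <a href="index.html"> (or index.pdf) -->
  <xsl:template match="link[starts-with(@href, 'site:')]">
    <a href="{substring-after(@href, 'site:')}.{$ext}">
      <xsl:apply-templates/>
    </a>
  </xsl:template>

  <!-- Everything else is copied through unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The point is only that the same declarator yields a different URL depending
on where the link is rendered.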

Actually, Forrest doesn't support this neat stuff yet, but that was
the original idea and it will be implemented as soon as enough users
complain about broken 'site:' links in their PDFs :o)

Anyway, point is, there are two sides to 'linkmapping', and both are

> Most of what I have described here, is how most people use Cocoon.
> What is new here is the use of a LinkMap at the Generation stage to 
> de-couple URL from SystemID in a totally arbitrary way. A version of 
> the LinkMap idea that makes sense for Dynamic sites. This is what 
> requires changes to the way we are able to use input modules, as input 
> modules would provide a much cleaner path to handle this rather than 
> Transforming a generated LinkMap into CInclude tags to get the content.
> Does this make any sense to anybody?

Perfectly :)

Now let me play devil's advocate.

If we assume the user knows how to edit the sitemap, is there any real
benefit in this extra level of declarator -> *.xml indirection?  What's
the difference between editing a sitemap.xmap, and editing a linkmap.xml
file?  Both just map declarators to SystemIDs.

In CVS Forrest, we've organized the sitemap into functional layers:

LAYER 1       |   (each format or subdir handler in its own sub-sitemap)
*.xml         |
   various    |    docv11     faq    howto    docbook   community/*  ....
   xml types  |       \        |       |         |         /
              |          DOCUMENT-V11 INTERMEDIATE FORMAT
LAYER 2       |                /       |               \
 Intermediate |    **body-*.xml     **menu-*.xml      **tab-*.xml
 HTML formats |               \        |               /
LAYER 3       |                     \|/       \|/
  Output      |                   *.html     *.pdf
  formats     |

The lowest layer (in forrest.xmap) is a bunch of *.xml matchers, which
provide the raw doc-v11 XML source.  For example, the faq.xml matcher
reads content/xdocs/faq.xml, transforms it with faq2document.xsl and
serves the content.
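
As a sketch only (this is not copied from forrest.xmap, and the stylesheet
location and the serializer type are assumptions), such a matcher would look
roughly like:

```xml
<map:match pattern="faq.xml">
  <!-- Read the raw FAQ source from the real filesystem -->
  <map:generate src="content/xdocs/faq.xml"/>
  <!-- Convert FAQ markup to doc-v11; the stylesheet path is assumed -->
  <map:transform src="faq2document.xsl"/>
  <!-- Serve the result as XML for the upper layers to consume -->
  <map:serialize type="xml"/>
</map:match>
```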

In effect, this defines a virtual filesystem of XML sources, completely
abstracted from the real filesystem.  For instance, to integrate RSS into
a Forrest site, one need define a single *.xml matcher that reads RSS
from somewhere and does a rss2document.xsl transformation.  The upper
layers don't care where they get their XML from, so you magically get
HTML and PDF from the RSS:
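
A minimal sketch of such a matcher, assuming a hypothetical 'news.xml'
pattern and feed URL (neither comes from a real Forrest site), and with the
stylesheet location also assumed:

```xml
<map:match pattern="news.xml">
  <!-- Fetch the RSS from wherever it lives; this URL is a placeholder -->
  <map:generate src="http://example.org/news.rss"/>
  <!-- Turn RSS into doc-v11; the stylesheet path is assumed -->
  <map:transform src="rss2document.xsl"/>
  <map:serialize type="xml"/>
</map:match>
```

With that in place, the existing *.html and *.pdf matchers would pick up
news.html and news.pdf without any further changes.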

So what I'm suggesting is that if your users are sufficiently
sitemap-savvy, and your sitemap is sufficiently modular, one can have all
the benefits of your proposed declarator -> XML linkmapping system.  In
fact it's much more powerful; how could a linkmap file handle an RSS file,
for example?  You'd need to tweak the sitemap to do the rss2document.xsl


> regards Jeremy
