forrest-dev mailing list archives

From Jeff Turner
Subject Semantic linking (Re: [VOTE] Usage of file.hint.ext convention)
Date Mon, 02 Sep 2002 15:08:28 GMT
On Mon, Sep 02, 2002 at 02:31:25PM +0200, Nicola Ken Barozzi wrote:
> Jeff Turner wrote:
> >I gather that (one of) the problems being addressed in this thread is the
> >where-to-link-to problem. Eg, in index.xml:
> >
> >  Read our <link href="primer.html">Forrest Primer</link> ... 
> >
> >And apparently that's bad. So my first question: why bad?
> >If that's the only reason, why not do "lazy resolution" of links. In the
> >XML, link to something abstract:
> >
> >Read our <link href="primer">Forrest Primer</link> ...
> Interesting, nobody removed the extension altogether yet :-)

A website is just a little corner of the web, so all the rules of the web
(REST) should apply.

Rule #1: A resource Identifier (URI) is not the same as a resource
Representation (HTTP response w/ content type). A resource may have
multiple representations, but they will all be identified by the same

(no, not sucked out of my thumb :) see

In our context, a link's href should hold an *identifier*. Eg, <link

However, obviously we need a way to indicate a desired resource
representation too. But that is a separate concern; it's not identifying
the *resource*, so it doesn't belong in the URI. HTTP handles this with
headers: the browser lists its preferred representations in Accept:, and
the response names the one chosen in Content-Type:. I think links should
have something similar:

<link href="primer" content-type="text/html">

When resolving that link, Cocoon says "give me the HTML representation of
the 'primer' resource".

Just as with web browsers, the content type is usually inferred from the
user's context. Users don't need to say "oh, and give me text/html
please"; it's inferred from the user agent (browser).

Likewise, links usually need only specify the URI, and let the content
type be inferred from the type of document that is doing the linking. Eg,
if index.xml is rendered to HTML, then <link href="primer">
gets automatically expanded to <link href="primer"
content-type="text/html">, at the time index.html is rendered.
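
A rough sketch of that expansion step, written as a standalone XSLT pass
that would run before the real stylesheet. The "default-content-type"
parameter is made up for illustration; nothing in Forrest or Cocoon
defines it, and the pipeline would have to pass in the MIME type of the
page being rendered:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical parameter: MIME type of the document doing the linking. -->
  <xsl:param name="default-content-type" select="'text/html'"/>

  <!-- Identity rule: copy everything else through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- A link with no explicit content-type inherits the default. -->
  <xsl:template match="link[not(@content-type)]">
    <xsl:copy>
      <xsl:attribute name="content-type">
        <xsl:value-of select="$default-content-type"/>
      </xsl:attribute>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Links that already carry a content-type attribute fall through to the
identity rule, so an explicit choice is never overwritten.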

So, what's the difference between <link href="primer.html"> and <link
href="primer" content-type="text/html">? The same difference as between
"identifying a resource" and "identifying a resource representation".
The gods of the web have deemed that they are separate concerns; that
"resource" != "resource representation"; they have separate identifiers;
one a URI, the other a MIME type. Trying to identify both in one "href"
attribute is mixing concerns.

Ahem. So there you go :) I fondly imagine this sort of thinking was going
through Steven's head when he -1'ed extra extensions in URIs.

So practically, how does one resolve:

  <link href="primer" content-type="text/html">
  <link href="primer" content-type="text/plain">
  <link href="primer" content-type="application/pdf">

With the 'header' selector I guess:

<map:match pattern="primer">
  <map:generate src="content/xdocs/primer.xml"/>

  <map:select type="header">
    <map:parameter name="header-name" value="Content-Type"/>

    <map:when test="text/html">
      <map:transform src="document2html.xsl"/>
      <map:serialize type="html"/>
    </map:when>

    <map:when test="application/pdf">
      <map:transform src="document2fo.xsl"/>
      <map:serialize type="fo2pdf"/>
    </map:when>
  </map:select>
</map:match>

The Cocoon link trawler would also need to set the content type from the
<link content-type="..."> attribute.

> >And then in document2html.xsl, just append the ".html":
> >
> >  <xsl:template match="link">
> >    <a><xsl:attribute name="href"><xsl:value-of select="concat(@href, '.html')"/></xsl:attribute>
> >      <xsl:apply-templates/>
> >    </a>
> >  </xsl:template>
> >
> >In the FO stylesheet, convert it to a <fo:basic-link>.
> >
> >So say a user has a PDF saved alongside all the XML files. Then <link
> >href="mypdf.pdf"> works as expected.
> >
> >All the world's problems solved by removing the extension instead of adding
> >extensions :)
> >
> >Please someone tell me where I lost the plot..
>   I save the files on my hd with an extension usually.
>   So I can have
>   myfile.xml
>   myfile.pdf
>   myfile.txt
> What gets used by Cocoon to generate myfile.pdf? The rule is not *that*
> clear...

Given the link <link href="myfile">, we'd first examine the 'context', ie
which file *contains* the link. If it's an HTML file, the link gets
expanded to <link href="myfile" content-type="text/html">, and therefore
myfile.html gets linked to.

> Hmmm...
> Also, when I link, I want sometimes to link to a specific content-type.

Then you link explicitly:

<link href="myfile" content-type="text/html">
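
Either way, once the content type is known, the HTML stylesheet could
turn it into a concrete extension. A minimal sketch, assuming a fixed
MIME-type-to-extension table (my assumption; the real mapping would
probably live somewhere configurable) and a fallback to ".html" because
this is the HTML rendering:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="link">
    <!-- Assumed mapping from content type to file extension. -->
    <xsl:variable name="ext">
      <xsl:choose>
        <xsl:when test="@content-type = 'text/plain'">.txt</xsl:when>
        <xsl:when test="@content-type = 'application/pdf'">.pdf</xsl:when>
        <!-- text/html, or no content type at all, in an HTML page. -->
        <xsl:otherwise>.html</xsl:otherwise>
      </xsl:choose>
    </xsl:variable>
    <a href="{concat(@href, $ext)}">
      <xsl:apply-templates/>
    </a>
  </xsl:template>

</xsl:stylesheet>
```

With this, <link href="myfile"> in an HTML page comes out as
<a href="myfile.html">, and <link href="myfile"
content-type="application/pdf"> comes out as <a href="myfile.pdf">.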

> Part of the problem is in fact in wanting more outputs for one input.
> You propose one input-one output.
> Anyway, I like the no-extension link, as it's nearer to my uri proposal...

I think your "primer.xml" is the same as my "primer". They're both
identifiers, independent of MIME type. Only difference is, "primer.xml"
as an identifier would look silly outside a filesystem, eg in an XML db.

> Then what about simply:
> - I can have only one file with a certain name in the dir
> - I can use extensions for my sake but they don't get used by Forrest


> - Forrest looks inside the file to see what it contains

Isn't that solving a different problem?

> - I always link to the name without extension

Yes, keep the information space clean and semantic.

> - If I want a particular doctype, the *link* URL is mypage/contenttype

Yes! :) Except rather than mypage/contenttype, have two separate


> - The extensions are created by Cocoon; we leave the 1-1 mapping on 
> extensions but keep them on the filename.
> This seems to solve it, right? (fingers crossed)
> -- 
> Nicola Ken Barozzi         
>             - verba volant, scripta manent -
>    (discussions get forgotten, just code remains)
> ---------------------------------------------------------------------
