forrest-dev mailing list archives

From Jeff Turner
Subject Re: File prefix again (Re: Cocoon CLI - how to generate the whole site)
Date Tue, 17 Dec 2002 04:39:03 GMT
On Mon, Dec 16, 2002 at 04:08:37PM +0100, Nicola Ken Barozzi wrote:
> >>Why would we need to rewrite "file:"s?
> >
> >Given the above definition, what do you think the implied scheme for
> ><link href="hello.pdf"> is?  What syntactic and semantic restrictions are
> >there?  Can we link to anything?  No: we can only link to URIs defined by
> >sitemap rules.  Therefore the implied scheme is 'cocoon:'.  I need to
> >invoke Cocoon to get 'hello.pdf'.  If my editor were written in Java as
> >an Avalon component, it might really be able to invoke Cocoon and
> >retrieve 'hello.pdf'.
> >
> >What about when a file is sitting on my harddisk?  Do I need Cocoon to
> >view it?  No; I can open it in an editor.  Hence the 'file:' protocol is
> >implied.  In fact, in vim I can type 'gf' and automatically traverse the
> >link.  My editor is a 'browser' of the Source URI space, just like
> >Mozilla browses the Destination URI space.
> >
> >That is the important concept: the Source URI space is distinct from the
> >Destination URI space.  In the Source URI space (XML docs + <link>
> >elems), we have all sorts of schemes (linkmap:, java:, file:, person:
> >etc), but in the Destination URI space (HTML docs + <a> elems), we have
> >only one protocol, usually http: or file:.
> First distinction: schemes are not IMV in the source URI space, but in 
> the destination URI space

In the destination URI space (HTML files), all our linkmap:, java:,
person:, mail: schemes have vanished.  They only exist in the source URI
space (XML files).

> hence my definition of link rewriting. Links are always seen from the
> outside IMV.

I edit XML files, which are source docs.  I edit the source links.
Currently, most source links are identical to destination links, but that
is what will change completely once we introduce schemes.  There is no
way you can pretend <link href="linkmap:/primer"> is a destination link,
because browsers don't understand the 'linkmap' protocol.  Only Cocoon
can.  Just as Cocoon translates source docs (XML) to destination docs
(HTML), it translates source URIs (link:, java:, etc URIs) to destination
URIs.

> With this in mind, you can infer why I don't see the need for a file:
> scheme.
> Thus I link to the resulting URI space, not the source one.

You do currently.  <link href="primer.html"> is a link to the destination
URI space.  But we have agreed that that is wrong.

> The resulting URI space can be complicated, so to ease the linking I
> use schemes to make linking easier.
> Well, it might as well be not the best thing to do, but this is what 
> I've been saying till now, so I see why we didn't really understand each 
> other.

Your view is perfectly clear and simple: schemes are aliasing mechanisms
to simplify linking to the destination URI space.

My view only makes sense once you a) buy into the notion that the Source
URI space exists and is distinct from the Destination URI space, and b)
understand that, given a), the implied *source* protocol for links is
currently 'cocoon:'.  Only then does the reason for file: become
apparent: static links do _not_ have the implied 'cocoon:' scheme.  We
need a different scheme to distinguish, say, a static index.pdf from an
index.pdf generated from index.xml.
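
A minimal sketch of that disambiguation, in Python rather than sitemap
XML.  The function name and the 'cocoon' default are illustrative
assumptions, not anything Forrest or Cocoon provides: a link with no
explicit scheme gets the implied one, and a 'file:' prefix overrides it.

```python
def split_scheme(href, default="cocoon"):
    """Return (scheme, path) for a source link, applying the implied scheme.

    Hypothetical helper: the 'cocoon' default mirrors the implied scheme
    discussed above; it is not an actual Forrest or Cocoon API.
    """
    scheme, sep, rest = href.partition(":")
    # Treat the prefix as a scheme only if a colon is present and the
    # prefix is a plain alphanumeric word (so '../a:b' is not a scheme).
    if sep and scheme.isalnum():
        return scheme, rest
    return default, href


print(split_scheme("index.pdf"))        # no prefix: implied scheme applies
print(split_scheme("file:index.pdf"))   # explicit prefix: a static file
```

The two calls differ only lexically, and that difference is all a
dispatcher would need in order to treat them differently.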

> >I described this notion of separating the Source and Destination URI
> >space in a RT:
> I read it, and I basically agree with it, except the above distinction 
> which wasn't clear to me in the first place.
> >So that is the theory: it is better to have an explicit file: scheme,
> >because it distinguishes those URIs from the implied 'cocoon:' scheme,
> >and fits in better in a world where there are schemes everywhere.
> Please expand on this. Do you mean file scheme=sources and cocoon 
> scheme=resulting URI space?


In a perfect world, the default scheme would be file:, not cocoon:.  So
we could have <link href="primer.xml">, or <link href="hello.pdf">.
Then, a linkmap would genuinely be an aliasing mechanism, but aliasing in
the _Source_ URI space.  E.g., <link href="site:/primer"> would be exactly
equivalent to <link href="primer.xml"> (or ../primer.xml or
../../primer.xml etc).  Ignore this paragraph if it doesn't make sense.
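
For what it is worth, that aliasing could be sketched roughly as follows.
This is Python for illustration only; the LINKMAP table, the function
name and the example document paths are all hypothetical.  A 'site:'
link is looked up in the linkmap and turned into a path relative to the
document that contains the link.

```python
import posixpath

# Hypothetical linkmap: alias -> source file, relative to the docs root.
LINKMAP = {"/primer": "primer.xml"}


def resolve_site_link(href, current_doc):
    """Resolve a 'site:' alias to a source path relative to current_doc.

    Links without the 'site:' prefix are returned unchanged.
    """
    prefix = "site:"
    if not href.startswith(prefix):
        return href
    target = LINKMAP[href[len(prefix):]]
    # Directory of the linking document; '.' when it sits in the docs root.
    start = posixpath.dirname(current_doc) or "."
    return posixpath.relpath(target, start)


print(resolve_site_link("site:/primer", "index.xml"))
print(resolve_site_link("site:/primer", "community/howto/index.xml"))
```

The same alias yields a different relative path depending on where the
linking document lives, which is the "../primer.xml or ../../primer.xml"
case above.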

> >Practically, right now, what is the difference?
> >
> >Well for a start, if we consistently used 'file:' for URIs identifying
> >static files, we could throw away the current resource-exists action:
> >
> >  <map:match pattern="**">
> >
> >    <map:act type="resource-exists">
> >     <map:parameter name="url" value="content/{1}"/>
> >     <map:read src="content/{../1}"/>
> >    </map:act>
> >    ....
> >
> >And replace it with a simple sitemap rule:
> >
> >  <map:match pattern="file:**">
> >    <map:read src="content/{1}"/>
> >  </map:match>
> Which is something I don't like.
> Again, you are telling Cocoon how to treat that file, which is not a 
> concern of the editor.

The implied URI scheme is 'cocoon:'.  By adding a 'file:' prefix, the
user is saying "no, this file is local".  There is nothing wrong with
this, and no other way to distinguish between, say, a static index.pdf
and one generated from index.xml.  The sitemap simply takes advantage of
the lexical difference.

> We decided to take away the extension to files, but this file: thing 
> does the same conceptual thing, it selects the sitemap to use inside the 
> link.

The difference is, the file: scheme is not added to make the sitemap
simpler.  That is just a nice side-effect.

> >Having to interrogate the filesystem to decide a URI's scheme is a total 
> >hack.
> >What happens if our docs are stored in Xindice, or anything other than a
> >filesystem?  Resource-exists is going to break.
> Hmmm, this is a good point, but not a resource-exists "conceptual" 
> problem. I can test if a resource exists also in remote repositories.
> If the "file:" thing takes care different backends, there is no reason 
> why a better resource-exists cannot. So seems is more about the 
> deficiencies of the resource-exists implementation rather than the need 
> of a site: scheme.

Say I want to link to a static index.pdf, but I forget to create it.  I
want that link to break!  I don't want Cocoon to be clever, and create
one from index.xml.  Resource-exists is an utter hack that doesn't
(cannot!) meet use-cases like this, because ultimately, only the user can
know if they are referring to a local file, or one generated by Cocoon.

> >Secondly, introducing a 'file:' prefix fixes the current name clash
> >problem.  What if I have a static file called 'index.pdf'?  How do I
> >access the index.pdf generated from XML?  I can't, because the
> >resource-exists will always choose for me.
> Which is another seemingly good point, but since we have decided that 
> link URIs should not end in extensions, because of many reasons one of 
> which is the fact that a URI can reference different formats at 
> different times in history, having a scheme that effectively makes me 
> serve two different versions of the same file is totally off-target.

See above.  There is _no way_ that a sitemap, with MIMETypeActions and
resource-exists and any other crazy hacks you care to name, can 100%
correctly choose between a static index.pdf and one generated from
index.xml.  Simply cannot, because there is missing info only the user
knows.  That is what the file: prefix adds.
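
A toy comparison of the two approaches, as a hedged sketch in Python.
Neither function is real Cocoon code, and the file sets passed in are
invented for the example.  The first function guesses by probing for a
static file, as resource-exists does; the second trusts the scheme the
user wrote.

```python
def serve_by_guessing(path, static_files, xml_sources):
    """resource-exists style: probe for a static file, else try to generate."""
    if path in static_files:
        return "static"
    stem = path.rsplit(".", 1)[0]
    if stem + ".xml" in xml_sources:
        return "generated"
    return "broken"


def serve_by_scheme(href, static_files, xml_sources):
    """Explicit-scheme style: 'file:' means static, anything else is generated."""
    if href.startswith("file:"):
        path = href[len("file:"):]
        return "static" if path in static_files else "broken"
    stem = href.rsplit(".", 1)[0]
    return "generated" if stem + ".xml" in xml_sources else "broken"


# Case 1: I forgot to create the static index.pdf.
print(serve_by_guessing("index.pdf", set(), {"index.xml"}))
print(serve_by_scheme("file:index.pdf", set(), {"index.xml"}))

# Case 2: name clash, both a static index.pdf and index.xml exist.
print(serve_by_guessing("index.pdf", {"index.pdf"}, {"index.xml"}))
print(serve_by_scheme("index.pdf", {"index.pdf"}, {"index.xml"}))
```

In case 1 the guessing version quietly generates a PDF where the
scheme-based one reports a broken link.  In case 2 the guessing version
can only ever return the static file, while the scheme-based one lets
the link author pick either.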

