forrest-dev mailing list archives

From Nicola Ken Barozzi <nicola...@apache.org>
Subject Re: Semantic linking (Re: [VOTE] Usage of file.hint.ext convention)
Date Mon, 02 Sep 2002 15:39:01 GMT
Steven Noels wrote:
> Jeff Turner wrote:
> 
> <snip/>
> 
>> So, what's the difference between <link href="primer.html"> and <link
>> href="primer" content-type="text/html">? The same difference as between
>> "identifying a resource" and "identifying a resource representation".
>> The gods of the web have deemed that they are separate concerns; that
>> "resource" != "resource representation"; they have separate identifiers;
>> one a URI, the other a MIME type. Trying to identify both in one "href"
>> element is mixing concerns.
>>
>> Ahem. So there you go :) I fondly imagine this sort of thinking was going
>> through Steven's head when he -1'ed extra extensions in URIs.
> 
> 
> (not really answering your mail, just attaching myself to the righteous 
> thought-train in this thread ;-)
> 
> I must have been fondly out of my mind as usual, I guess, but here's a 
> summarization of an hour of intense mind battle in our offices just now, 
> being challenged by Marc (who is actually much smarter than me but has a 
> problem with his hard drive organization, hence his need for two 
> extensions):
> 
> There are three kinds of sources being processed through Forrest's 
> request space:
> 
>                  Name                                   URI
> 
> 1) XML (xdoc, docbook, YourGrammar)                **.{rendition}
> 2) XML-isable non-XML (e.g. DTD documentation)     **.{hint}.{rendition}
> 3) non-XML sources (images, static HTML/PDF/etc)   **.{extension}
>    (detected by wrapping the pipelines
>    in a ResourceExistAction)
> 
> {rendition} being html, pdf, wml, svg, ...
> {hint} being dtdx ...
> 
> Examples:
> 
> 1) manual/users/concepts.html
>    pressreleases/2001-02-06.pdf
> 
> 2) dtdx/document-v11.html
>    /09/11/23.downloadstats.svg
> 
> 3) architecture.png
>    dist/forrest-src.tgz
> 
> That being said, I believe we can set up a sitemap (*the* Forrest 
> sitemap, which is the definitive reference for the URI space being 
> processed by Forrest) that handles these three types of sources with 
> only minimal prextensination [1] of our URI space.
> 
> 1) Using CAPs, we are able to describe how XML sources, depending on 
> their grammar, must be preprocessed to conform to the intermediate 
> format. People will be able to link to a named XML document, regardless 
> of the preprocessing required, using <link 
> href="path/name(.{rendition})"/> (and I must still read Jeff's analysis 
> of the merits of having an extension in the href linking attribute).

I don't like this.
Users must link sources, not renditions.

Having another attribute with content-type is much better.
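
To make the difference concrete, here is a minimal sketch of what I mean by
expanding a source link plus a content-type attribute into a rendition URI.
The RENDITION_EXTENSIONS table, the default of text/html and the expand_link
function are all hypothetical names for illustration, not Forrest code.

```python
# Hypothetical mapping from MIME type to rendition extension. This table is
# an assumption made for the example, not something defined by Forrest.
RENDITION_EXTENSIONS = {
    "text/html": "html",
    "application/pdf": "pdf",
    "image/svg+xml": "svg",
    "text/vnd.wap.wml": "wml",
}

# Assumed default rendition when the link carries no content-type attribute.
DEFAULT_CONTENT_TYPE = "text/html"


def expand_link(href, content_type=None):
    """Combine a source link (the resource) with an optional MIME type
    (the representation) into the URI of one concrete rendition."""
    ext = RENDITION_EXTENSIONS[content_type or DEFAULT_CONTENT_TYPE]
    return f"{href}.{ext}"


print(expand_link("primer", "text/html"))
print(expand_link("primer", "application/pdf"))
print(expand_link("primer"))
```

The author only ever writes href="primer"; the extension shows up after
expansion, so the source document never names a rendition itself.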

> We were thinking along the lines of a configuration section in the 
> sitemap listing possible identifiers to assign documents to a certain 
> 'document class': public identifier, root element name, 
> xsi:schemaLocation attribute, ...
> 
> Configuration of the pipeline would then be done in a CAPAction, setting 
> sitemap parameters, i.e. selecting the correct 
> authoringformat2intermediateformat.xsl - I will expand on this if the 
> dust in my mind has settled (and Bruno has defined his implementation 
> strategy ;-)

Simply put, the link is expanded as Jeff says before processing the 
content; then we should be able to select the source file just given the 
name (no resource-exists needed), and process it from the CAP rules.
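
As a rough sketch of "process it from the CAP rules", assuming the rules key
on the root element name only (the public identifier or schema location could
be used the same way), something like the following would do. The CAP_RULES
table and the stylesheet file names are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical CAP rules: root element name -> stylesheet that converts the
# authoring format into the intermediate format. All file names are invented.
CAP_RULES = {
    "document": "document2intermediate.xsl",
    "book": "docbook2intermediate.xsl",
}


def select_stylesheet(xml_text):
    """Pick the pre-processing stylesheet from the document's root element.

    Returns None when no rule matches the root element.
    """
    root = ET.fromstring(xml_text)
    # ElementTree reports namespaced tags as "{uri}name"; keep only the name.
    name = root.tag.rsplit("}", 1)[-1]
    return CAP_RULES.get(name)


print(select_stylesheet("<document><body/></document>"))
print(select_stylesheet("<book><chapter/></book>"))
```

The point is that the choice depends on the content of the source, not on
anything in the URI.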

> The pipeline is basically divided in two parts: pre- and 
> post-intermediate format. The pre-IMF should not be 'visible' for the 
> document editor: he just authors a document using a certain grammar and 
> stores it on disk. The post-IMF contains the skinning, TOC aggregation, 
> etc...

Ok. (conceptually)

> Rendition finally is specified using the extension, and is part of the 
> post-IMF process (= part of the document author concern when creating a 
> link).

No, as an attribute.

> 2) Given the hint, the pipeline can be especially configured, i.e. 
> setting the Generator type to nekodtd for a DTD source - the rendition 
> is specified using the extension like in 1). The XML originating from 
> those sources can then be subject to CAP-processing.

If the hint is given, yes.
If not, the default rendition is used.

> 3) For non-XML sources, there is a ResourceExistAction wrapping all 
> this, checking whether the resource being requested already exists on 
> disk, and if so, using its extension, <map:read>'s it to the 
> browser/crawler.

No need for resource-exists: if the names must be unique, it's just the 
pipeline that matches the *only* file and *then*, given the extension, it 
knows to read rather than generate.
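
A minimal sketch of that decision, assuming one file per document name. The
SOURCES table, the XML_EXTENSIONS set and the plan function are hypothetical
and only illustrate the idea; a real sitemap would do this with matchers.

```python
# Hypothetical source tree: unique document name -> extension of the single
# file stored under that name.
SOURCES = {
    "manual/users/concepts": "xml",
    "architecture": "png",
    "dist/forrest-src": "tgz",
}

# Assumed set of extensions that are treated as XML sources.
XML_EXTENSIONS = {"xml"}


def plan(name):
    """Return (action, source path) for a unique document name.

    XML sources are generated through the pipeline; everything else is
    read and served as-is.
    """
    ext = SOURCES[name]  # names are unique, so one lookup identifies the file
    action = "generate" if ext in XML_EXTENSIONS else "read"
    return action, f"{name}.{ext}"


print(plan("manual/users/concepts"))
print(plan("architecture"))
```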

> OK - this is only a short summary but I hope it is clear. Do we move 
> forward with this?

With the use of an attribute versus extension, yes.

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)
---------------------------------------------------------------------

