forrest-dev mailing list archives

From Nicola Ken Barozzi <nicola...@apache.org>
Subject Re: Multiple file output formats and views (skinned-source-translation-etc)
Date Sun, 09 May 2004 21:18:27 GMT
Nicola Ken Barozzi wrote:
...
> How does Apache HTTP deal with this? What standards are there that we 
> can make use of?

Replying to myself:

I have read the part of the HTTP/1.1 protocol spec (RFC 2616) that covers content negotiation.

    ftp://ftp.isi.edu/in-notes/rfc2616.txt

Further information and notes on the subject in general:

http://www.apacheweek.com/features/negotiation
http://www.imc.org/ietf-medfree/index.html
http://www.ietf.org/rfc/rfc2534.txt
http://www.ietf.org/rfc/rfc2913.txt
http://www.ietf.org/rfc/rfc2912.txt
http://www.ietf.org/rfc/rfc2938.txt

The problem with using it is described here:

   http://norman.walsh.name/2003/07/02/conneg

Add to that the important fact that browsers are not capable of 
agent-driven negotiation :-(


So I turned to "Architecture of the World Wide Web"

   http://www.w3.org/TR/webarch/

I found two important parts:

   http://www.w3.org/TR/webarch/#uri-opacity

"Good practice: URI opacity

Agents making use of URIs MUST NOT attempt to infer properties of the 
referenced resource except as licensed by relevant specifications.
"

This means that ideally browsers should not depend on the "extensions" 
of the filenames in the URIs to infer content type. In reality, 
unfortunately, IE does, and all major OSes do the same for static files.

Furthermore:

   http://www.w3.org/TR/webarch/#internet-media-type

"
Good practice: Fragment identifier consistency

A resource owner who creates a URI with a fragment identifier and who 
uses content negotiation to serve multiple representations of the 
identified resource SHOULD NOT serve representations with inconsistent 
fragment identifier semantics.
"

This means that the same URI should always show the same data, only in 
different representations. So a page can be fetched as html or pdf from 
the same URI, but the documentation for that page's html source code is 
a different thing: it is not semantically equivalent, and so it should 
have a different URI.


                           - = -

So, what I think we can conclude from this is that

  1 - we should use links without extensions where possible
  2 - since the real world uses extensions, we can use extensions
      to define actual representations
  3 - we can use redirects to redirect the browser to a specific
      format (with extensions) from the one without
  4 - the extra "views" of the source should have their own URI
  5 - these URIs should differ in the filename, so that static
      representations can show the correlation and differences
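
Points 1 to 3 above could be sketched roughly as follows, in Python. This is 
only an illustration under stated assumptions, not how Forrest or Apache HTTP 
actually does it: the EXTENSIONS table, the function names and the "html" 
default are all hypothetical, and wildcards such as */* are not matched but 
simply fall through to the default.

```python
# Hypothetical mapping from media types to the extensions we publish.
EXTENSIONS = {"text/html": "html", "application/pdf": "pdf"}


def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, q) pairs, best first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means q=1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def redirect_target(path, accept, default="html"):
    """Return the URI with extension that an extensionless request is redirected to."""
    for media_type, q in parse_accept(accept):
        if q > 0 and media_type in EXTENSIONS:
            return f"{path}.{EXTENSIONS[media_type]}"
    # Nothing usable in the header (or only wildcards): use the default format.
    return f"{path}.{default}"
```

A request for "index" with "Accept: application/pdf" would then be redirected 
to "index.pdf", while a request with no usable Accept header would go to 
"index.html".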

Finally, MHO about languages is that translations are in fact 
semantically equivalent, but that for practicality they should be 
handled as semantically different docs, distinguished in the filename.

IOW:

    index
    index.pdf
    index-source.html
    index-source.pdf
    index-doc.html
    index.html.fr
    index.pdf.fr
    index-docs.pdf.fr

would give, respectively:

1 - index.html (later the best one based on content negotiation)
2 - index.pdf
3 - index-source.html
4 - index-source.pdf
5 - index-doc.html
6 - index.html.fr (in French)
7 - index.pdf.fr  (in French)
8 - index-docs.pdf.fr  (highlighted French docs)
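
To make the naming scheme concrete, here is a rough Python sketch that splits 
such a name into base, view, format and language. The sets of known formats, 
languages and views are assumptions taken only from the examples above, and 
parse_name is a hypothetical helper, not existing Forrest code.

```python
from typing import NamedTuple, Optional

# Assumed vocabularies, taken only from the example names in this message.
FORMATS = {"html", "pdf"}
LANGUAGES = {"fr"}
VIEWS = {"source", "doc", "docs"}


class Representation(NamedTuple):
    base: str
    view: Optional[str]
    fmt: Optional[str]
    lang: Optional[str]


def parse_name(name):
    """Split e.g. 'index-source.pdf.fr' into base, view, format and language."""
    parts = name.split(".")
    stem, suffixes = parts[0], parts[1:]
    # The language, if present, is the last suffix; the format comes before it.
    lang = suffixes.pop() if suffixes and suffixes[-1] in LANGUAGES else None
    fmt = suffixes.pop() if suffixes and suffixes[-1] in FORMATS else None
    if suffixes:
        raise ValueError(f"unrecognised suffixes in {name!r}: {suffixes}")
    # The view, if present, is the part of the stem after the last hyphen.
    base, sep, view = stem.rpartition("-")
    if not sep or view not in VIEWS:
        base, view = stem, None
    return Representation(base, view, fmt, lang)
```

With this, "index" has no view, format or language (so it is the candidate 
for negotiation), while "index-source.pdf" is the "source" view of "index" 
in pdf.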

There is just one issue with this: IIRC IE would choke on index.pdf.fr, 
and would like index.fr.pdf instead. Not sure if this change is worth it.
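
If we did decide to switch, the rename itself is mechanical. A minimal Python 
sketch, assuming the same hypothetical format and language sets as in the 
examples above (the function name is made up for illustration, and the IE 
behaviour is only my recollection, not something this code checks):

```python
# Assumed vocabularies, taken only from the example names in this message.
FORMATS = {"html", "pdf"}
LANGUAGES = {"fr"}


def language_before_format(name):
    """Turn 'index.pdf.fr' into 'index.fr.pdf'; leave other names untouched."""
    parts = name.split(".")
    if len(parts) >= 3 and parts[-1] in LANGUAGES and parts[-2] in FORMATS:
        # Swap the trailing format and language suffixes.
        parts[-1], parts[-2] = parts[-2], parts[-1]
    return ".".join(parts)
```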

Enough thinking for tonight ;-)

-- 
Nicola Ken Barozzi                   nicolaken@apache.org
             - verba volant, scripta manent -
    (discussions get forgotten, just code remains)