cocoon-dev mailing list archives

From Paul Russell <>
Subject [RT] [OT] Content_negotiation++;
Date Thu, 25 May 2000 01:49:24 GMT
Hi all,

I hope I'm not infringing an unwritten copyright of Stefano's
here, but this e-mail *definitely* fits into the category of
a random thought, rather than an immediate proposition. It's
also somewhat off topic; this thought is NOT a suggestion
for something we could do with Cocoon - I just don't think
it makes sense in that environment. It's a random thought
that's been bouncing around my head, and I thought I'd share
it with you.

Before I dive in and start wombling, note that my internet
connection is currently as dead as a dodo, and so I can't
check any namespaces or anything. This also has the side
effect that I have absolutely no idea whatsoever when our
mailserver will actually succeed in delivering it.

I've simplified a lot of this, because (a) I don't want
to cloud matters, if I can avoid it, and (b) it's too
damned late for me to think straight ;)

I've been thinking about content/presentation abstraction for
a *long* time, and even more since I've been involved with

One thing has always been in the back of my mind..

  "Surely the server should be able to deliver content
   in whatever format the client wants? More to the point,
   surely I shouldn't always have to think about it?"

Let's think about this. Currently in Cocoon we have either
PIs (in Cocoon1.x) or a sitemap (Cocoon2). In each of these,
we explicitly specify how to get from the source document
to the browser. Now, in Cocoon2, we can use matchers to
determine what format to send data to the client in, which
is great. I can have exactly the same URI, and without
changing the source document, I can pump it out in HTML,
text, PDF, or if I'm feeling particularly adventurous (or
plain sick - you decide) SVG, PNG or JPEG. Cocoon1.x offers
similar facilities, albeit in a somewhat under-engineered

Real Soon Now, we'll be able to use something called
'content negotiation' to work out what format browsers
would prefer without having to infer it from the URL
or User Agent.

The way this works is that the client sends an 'Accept:'
header to the server specifying what types of data it can
understand. This feature is somewhat underdeveloped in
current browsers, but it will improve, particularly as
technologies such as Cocoon become prevalent.
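
To make the 'Accept:' idea concrete, here is a rough sketch of how a
server might rank what the client asks for against the formats it can
produce. The function names (parse_accept, choose_format) are made up
for illustration, and this deliberately ignores the finer points of the
HTTP spec (specificity ordering, media type parameters); it only honours
q-values and simple wildcards.

```python
def parse_accept(header):
    """Return the media types in an Accept header, most preferred first."""
    prefs = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q-value means "fully acceptable"
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs.append((q, position, media_type))
    # Highest q first; ties keep the order the client listed them in.
    prefs.sort(key=lambda p: (-p[0], p[1]))
    return [media_type for q, _, media_type in prefs if q > 0]


def choose_format(header, available):
    """Pick the first offered media type the client is willing to accept."""
    for wanted in parse_accept(header):
        for offered in available:
            if wanted in (offered, "*/*"):
                return offered
            # "image/*" style wildcard: compare the "image/" prefix.
            if wanted.endswith("/*") and offered.startswith(wanted[:-1]):
                return offered
    return None
```

A WAP phone sending "text/vnd.wap.wml;q=0.9, text/html;q=0.5" would get
WML from a server offering both, without the URL or User Agent ever
being inspected.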

Now, the seed that has been growing in my head (albeit on
the back burner) for a good few months now is that it is
potentially possible to take this concept further.

All XML documents have a 'namespace'. This is a unique URI,
which allows us to be *sure* we're dealing with the set of
semantics we were expecting. As the XML content within 
Cocoon flows from the generator to the serializer, this
namespace changes.

Now, take the following lump of (imaginary) config for a
Cocoon-like system:

    <parameter name="stylesheet" value="intro-layout.xsl"/>
    <parameter name="stylesheet" value="subj-layout.xsl"/>
    <parameter name="stylesheet" value="layout-xhtml.xsl"/>
    <parameter name="stylesheet" value="layout-wml.xsl"/>

The server could then build a node-edge graph of the
translations in memory, and back propagate the target
namespaces to preceding nodes in the graph. For the
above (very simple) config file, the graph would look
like [namespaces shortened]:

[uea/prosp/intro] \                      / [xhtml]
                   > [uea/prosp/layout] <
 [uea/prosp/subj] /                      \ [wml]

The wml and xhtml namespaces would be back propagated
towards the left of the diagram, so that the 'layout'
node knows to go 'up' for xhtml, and 'down' for wml,
and so that the 'intro' and 'subj' nodes know that
they can reach 'layout', 'xhtml', and 'wml' by going

When a request comes in, it would be tagged with the
destination namespace (xhtml, wml, svg, whatever...).
When the source XML is parsed/generated, we discover
its namespace from the root element, and go find
ourselves that node in the graph. The node then looks
at the destination namespace and forwards the SAX
events to a filter and on to the destination node.
This process continues until it gets to the destination
node, at which point it's serialized, which brings me
nicely onto the next point...
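
The request-time walk described above might look something like the
following sketch. It only *plans* the chain of stylesheets rather than
forwarding SAX events through real filters, and the ROUTES table is a
hypothetical, hand-written stand-in for whatever the server would have
computed at start-up (namespaces shortened as before).

```python
import xml.etree.ElementTree as ET

# Hypothetical routing table: ROUTES[node][destination] = (stylesheet, next node)
ROUTES = {
    "uea/prosp/intro": {
        "uea/prosp/layout": ("intro-layout.xsl", "uea/prosp/layout"),
        "xhtml": ("intro-layout.xsl", "uea/prosp/layout"),
        "wml": ("intro-layout.xsl", "uea/prosp/layout"),
    },
    "uea/prosp/layout": {
        "xhtml": ("layout-xhtml.xsl", "xhtml"),
        "wml": ("layout-wml.xsl", "wml"),
    },
    "xhtml": {},
    "wml": {},
}


def root_namespace(document):
    """Return the namespace URI of the document's root element ('' if none)."""
    tag = ET.fromstring(document).tag  # ElementTree spells it '{uri}localname'
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


def plan_pipeline(document, destination, routes=ROUTES):
    """List the stylesheets to apply, in order, to reach the destination."""
    node = root_namespace(document)
    if node not in routes:
        raise LookupError("no node in the graph for namespace %r" % node)
    sheets = []
    while node != destination:
        if destination not in routes[node]:
            raise LookupError("cannot reach %r from %r" % (destination, node))
        if len(sheets) > len(routes):
            raise LookupError("routing table contains a loop")
        sheet, node = routes[node][destination]
        sheets.append(sheet)
    return sheets
```

A request tagged 'wml' for a document whose root element lives in the
'intro' namespace would then be planned as intro-layout.xsl followed by
layout-wml.xsl, with serialization happening once the destination node
is reached.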

I've got to admit, up to this point, I've made a
rather large simplification. Great. We can transform
from one namespace to another, but how on earth
are we going to get a png or a jpeg out? The obvious
answer is to treat mime types in a similar manner,
and build them into the node graph. This causes
problems, because suddenly we're not dealing with
filters, we're dealing with serializers, which have
a SAX input stream, but a *binary* output stream.
To be frank, I'm still pondering this bit.

I've explained *how* something like this could work
(loosely, I admit), but the question now has to be
"why on earth would you want to?".

The answer is (and I did warn you this was *not* an
immediate proposition) that at the moment, you
*wouldn't* (this thing is *not* in the Cocoon2 target
area, as far as I'm concerned). The sitemap handles
pretty much anything most people are going to throw
at it for the foreseeable future (and a hefty wadge that
most people aren't <grin>).

Where I *can* see it being useful is where you're
dealing with all kinds of different DTDs from a
particular URI space, and matching becomes
cumbersome. For example, imagine you had a project
linked to CORBA. Everything in /objects/* was
linked to a CORBA generator, so that /objects/<iiopID>
retrieved the content of an object. You could
potentially write a matcher, and put entries in the
sitemap for each type of object. This could become
somewhat cumbersome, particularly if you're targeting
WAP and HTML and PDF, for example. Using the directed
graph, you don't worry about it - just let the server
work out the easiest way to translate the document
into what the client wants.

I don't know, maybe this is a Really Bad Idea (tm),
but I think it could be useful in some situations,
particularly inside large application servers where
you've got lots of people administering the system.
If people could
upload a stylesheet, and just specify the source
and destination namespaces and let the server work
out when to use it, we save ourselves a lot of
configuration nightmares.

Anyway, that's my random thought for the evening,
hope reading this e-mail wasn't a *total* waste
of time for those who made it ;)


Paul Russell                               <>
Technical Director,         
Luminas Ltd.
