cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: [RT] [OT] Content_negotiation++;
Date Thu, 25 May 2000 12:16:15 GMT
Paul Russell wrote:
> 
> Hi all,
> 
> I hope I'm not infringing an unwritten copyright of Stefano's
> here, 

No, not at all. I'd love to have a collection of random thoughts from
everybody on this list, it would make it easier to create a coherent
model that fits all needs. The power of open source is that it is also
openly developed. I wish more things were openly developed.

> but this e-mail *definitely* fits into the category of
> a random thought, rather than an immediate proposition. It's
> also somewhat off topic; this thought is NOT a suggestion
> for something we could do with Cocoon - I just don't think
> it makes sense in that environment. It's a random thought
> that's been bouncing around my head, and I thought I'd share
> it with you.

cool, see my comments below
 
> Before I dive in and start wombling, note that my internet
> connection is currently as dead as a dodo, and so I can't
> check any namespaces or anything. This also has the side
> effect that I have absolutely no idea whatsoever when our
> mailserver will actually succeed in delivering it.
> 
> I've simplified a lot of this, because (a) I don't want
> to cloud matters, if I can avoid it, and (b) it's too
> damned late for me to think straight ;)
> 
> I've been thinking about content/presentation abstraction for
> a *long* time, and even more since I've been involved with
> Cocoon.
> 
> One thing has always been in the back of my mind..
> 
>   "Surely the server should be able to deliver content
>    in whatever format the client wants? More to the point,
>    surely I shouldn't always have to think about it?"

I had the exact same feeling when moving to the sitemap mindset. In
fact, Cocoon1 PIs were created to _reduce_ the load of site management
by allowing each one to indicate _how_ their resource was to be
constructed.

But it failed to provide a central place for administration, thus
_increasing_ the site management workload later on and reducing resource
reuse.

What you are talking about was already discussed previously and named
"namespace reaction": you indicate what transformations to apply when a
namespace is found.

We'll see the problems this reasoning has.
 
> Let's think about this. Currently in Cocoon we have either
> PIs (in Cocoon1.x) or a sitemap (Cocoon2). In each of these,
> we explicitly specify how to get from the source document
> to the browser. Now, in Cocoon2, we can use matchers to
> determine what format to send data to the client in, which
> is great. I can have exactly the same URI, and without
> changing the source document, I can pump it out in HTML,
> text, PDF, or if I'm feeling particularly adventurous (or
> plain sick - you decide) SVG, PNG or JPEG. Cocoon1.x offers
> similar facilities, albeit in a somewhat under-engineered
> fashion.
> 
> Real Soon Now, we'll be able to use something called
> 'content negotiation' to work out what format browsers
> would prefer without having to infer it from the URL
> or User Agent.

I would say: as soon as the sitemap matchers are implemented.
 
> The way this works is that the client sends an 'Accept:'
> header to the server specifying what types of data it can
> understand. This feature is somewhat underdeveloped in
> current browsers, but it will improve, particularly as
> technologies such as Cocoon become prevalent.

Apache is very good at this, even if most people don't know it. The
problem is that it's very good on static content only.
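
To make the 'Accept:' mechanism concrete, here is a minimal Python sketch of how a server might pick a format from that header. The function names (parse_accept, matches, negotiate) and the sample header are made up for illustration; this is not Cocoon or Apache code, and it deliberately ignores the specificity rules of full HTTP content negotiation.

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # quality defaults to 1 when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unreadable q as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def matches(pattern, media_type):
    """Match a media range such as */* or text/* against a concrete type."""
    if pattern == "*/*":
        return True
    if pattern.endswith("/*"):
        return media_type.startswith(pattern[:-1])
    return pattern == media_type


def negotiate(header, available):
    """Return the first available type the client accepts, or None."""
    for pattern, q in parse_accept(header):
        if q <= 0:
            continue
        for media_type in available:
            if matches(pattern, media_type):
                return media_type
    return None
```

With the header "text/vnd.wap.wml, text/html;q=0.8, */*;q=0.1" and a server that can produce text/html and application/pdf, negotiate() settles on text/html: WML is preferred but unavailable, so the next best match wins.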
 
> Now, the seed that has been growing in my head (albeit on
> the back burner) for a good few months now is that it is
> potentially possible to take this concept further.
> 
> All XML documents have a 'namespace'. 

First problem: a general well-formed XML document has "no" namespace.
But I agree that every "well-behaved" XML document should indicate its
namespace, just like any "well-behaved" document should use XLink for
links and RDF for metadata.

> This is a unique URI,
> which allows us to be *sure* we're dealing with the set of
> semantics we were expecting. As the XML content within
> Cocoon flows from the generator to the serializer, this
> namespace changes.

Yes.
 
> Now, take the following lump of (imaginary) config for a
> Cocoon-like system:
> 
> <negotiate>
>   <translate
>     from="http://xmlns.luminas.co.uk/uea/prosp/intro/"
>     to="http://xmlns.luminas.co.uk/uea/prosp/layout/"
>     filter="xslt">
>     <parameter name="stylesheet" value="intro-layout.xsl"/>
>   </translate>
>   <translate
>     from="http://xmlns.luminas.co.uk/uea/prosp/subj/"
>     to="http://xmlns.luminas.co.uk/uea/prosp/layout/"
>     filter="xslt">
>     <parameter name="stylesheet" value="subj-layout.xsl"/>
>   </translate>
>   <translate
>     from="http://xmlns.luminas.co.uk/uea/prosp/layout/"
>     to="http://www.w3c.org/1999/xhtml/"
>     filter="xslt">
>     <parameter name="stylesheet" value="layout-xhtml.xsl"/>
>   </translate>
>   <translate
>     from="http://xmlns.luminas.co.uk/uea/prosp/layout/"
>     to="http://www.wapforum.org/[...Iforget...]"
>     filter="xslt">
>     <parameter name="stylesheet" value="layout-wml.xsl"/>
>   </translate>
> </negotiate>

Ok, I see.

What you are writing is a "translation map": instructions for what
transformations to apply to get from one namespace to another. This is
why I call it "namespace reaction": you feed the processing engine with
instructions to allow it to understand, reacting on the namespace found,
what transformation to apply.
 
> The server could then build a node-edge graph of the
> translations in memory, and back propagate the target
> namespaces to preceding nodes in the graph. For the
> above (very simple) config file, the graph would look
> like [namespaces shortened]:
> 
> [uea/prosp/intro] \                      / [xhtml]
>                    |-[uea/prosp/layout]-|
>  [uea/prosp/subj] /                      \ [wml]
> 
> The wml and xhtml namespaces would be back propagated
> towards the left of the diagram, so that the 'layout'
> node knows to go 'up' for xhtml, and 'down' for wml,
> and so that the 'intro' and 'subj' nodes know that
> they can reach 'layout', 'xhtml', and 'wml' by going
> 'right'.
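
The back-propagation Paul describes could be sketched roughly as follows in Python. The triples mirror the quoted config with the namespaces shortened as in the diagram; build_routes and plan are hypothetical names, and the breadth-first walk is just one plausible way to fill in the per-node routing table, assuming exactly one stylesheet per edge.

```python
from collections import defaultdict

# (from, to, stylesheet) triples taken from the quoted <negotiate> config,
# with namespaces shortened as in the diagram.
TRANSLATIONS = [
    ("uea/prosp/intro", "uea/prosp/layout", "intro-layout.xsl"),
    ("uea/prosp/subj", "uea/prosp/layout", "subj-layout.xsl"),
    ("uea/prosp/layout", "xhtml", "layout-xhtml.xsl"),
    ("uea/prosp/layout", "wml", "layout-wml.xsl"),
]


def build_routes(translations):
    """Return routes[node][target] = (next_node, stylesheet)."""
    incoming = defaultdict(list)
    for src, dst, sheet in translations:
        incoming[dst].append((src, sheet))

    routes = defaultdict(dict)
    targets = {dst for _, dst, _ in translations}
    for target in targets:
        # Walk backwards from the target, breadth first, so every node
        # records the first hop of a shortest path towards that target.
        frontier = [target]
        seen = {target}
        while frontier:
            next_frontier = []
            for node in frontier:
                for src, sheet in incoming[node]:
                    if src in seen:
                        continue
                    seen.add(src)
                    routes[src][target] = (node, sheet)
                    next_frontier.append(src)
            frontier = next_frontier
    return routes


def plan(routes, source, target):
    """List the stylesheets to apply, in order, or None if unreachable."""
    steps = []
    node = source
    while node != target:
        if target not in routes.get(node, {}):
            return None
        node, sheet = routes[node][target]
        steps.append(sheet)
    return steps


ROUTES = build_routes(TRANSLATIONS)
```

Here plan(ROUTES, "uea/prosp/intro", "wml") yields intro-layout.xsl followed by layout-wml.xsl, which is the "go right, then down" route from the diagram.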

Second problem: there are _infinite_ ways to move from one namespace to
another.

This is called _styling_ :-)

What you are proposing is a single-style, hardwired path between
namespaces, which is good only if there is one and only one
transformation style applied to move from namespace A to namespace B.

For example, suppose you have a document that contains three of your
namespaces: doc: for documents, sql: for sql-generated data, quote: for
SOAP-retrieved stock information.

So you have

 origin := doc + sql + quote

Now you want to generate a PDF report of this. How do you do it? You need

 transform1 := (sql -> doc)
 transform2 := (quote -> doc)
 transform3 := (doc -> docbook)
 transform4 := (docbook -> fo)
 serializer := (fo -> pdf)

which you can create with the above graph.

But now you want to create an SVG table out of your SQL data while still
keeping the PDF output. OK, then you need

 transform1 := (sql -> svg)
 transform2 := (quote -> doc)
 transform3 := (doc -> docbook)
 transform4 := (docbook -> fo)
 serializer := (fo + svg -> pdf)

[NOTE: each transformation _must_ copy all the namespaces it is not
programmed to transform, otherwise the whole thing collapses and order
of namespace application becomes vital]
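
As a rough illustration of that pass-through rule, here is a toy Python model of the second chain. A document is reduced to a list of (namespace, payload) pairs, and make_transform and run are invented helpers; real transformers would of course operate on SAX events or trees, not tuples.

```python
def make_transform(src, dst):
    """Build a transform that rewrites `src` elements and copies the rest."""
    def transform(elements):
        out = []
        for ns, payload in elements:
            if ns == src:
                out.append((dst, payload))   # the namespace we handle
            else:
                out.append((ns, payload))    # copy everything else untouched
        return out
    return transform


def run(elements, chain):
    """Apply each transform in order, like a pipeline of filters."""
    for step in chain:
        elements = step(elements)
    return elements


# origin := doc + sql + quote
origin = [
    ("doc", "intro text"),
    ("sql", "sales rows"),
    ("quote", "stock price"),
]

pdf_chain = [
    make_transform("sql", "svg"),
    make_transform("quote", "doc"),
    make_transform("doc", "docbook"),
    make_transform("docbook", "fo"),
]

result = run(origin, pdf_chain)
namespaces = {ns for ns, _ in result}
```

After the chain runs, only the fo and svg namespaces remain, which is exactly what the (fo + svg -> pdf) serializer expects. If any step dropped the elements it did not understand, the svg content would never reach the serializer.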

OK, cool, but your boss wants big, fancy graphics all over for printing
brochures, all with the same data and with a fancier 3D SVG graph.
You look at the namespace chain (your transformation skeleton) and you
find it's exactly the same, but you just have to apply another _flavor_
of transformation.

In the end, "namespace reaction" would be possible only if you declared
all the possible ways to crawl the namespace trellis, attaching your own
identifiers to every path.
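
A small Python sketch of what "identifiers on every path" might look like: each edge between two namespaces carries a flavor label, and the engine can only choose a route once it is told which labels to follow. The flavor names and the labelled_paths helper are hypothetical, chosen purely for illustration.

```python
# (from, to, flavor) edges; several flavors may join the same two namespaces.
EDGES = [
    ("doc", "docbook", "default"),
    ("docbook", "fo", "report"),
    ("docbook", "fo", "brochure"),
    ("sql", "svg", "table"),
    ("sql", "svg", "3d-graph"),
]


def labelled_paths(edges, source, target, _visited=()):
    """Enumerate every path from source to target as a list of edges."""
    if source == target:
        return [[]]
    visited = _visited + (source,)
    paths = []
    for src, dst, flavor in edges:
        if src != source or dst in visited:
            continue  # wrong starting point, or it would form a cycle
        for rest in labelled_paths(edges, dst, target, visited):
            paths.append([(src, dst, flavor)] + rest)
    return paths


doc_to_fo = labelled_paths(EDGES, "doc", "fo")
flavors = [[flavor for _, _, flavor in path] for path in doc_to_fo]
```

The namespace skeleton doc -> docbook -> fo is the same for both results; only the flavor labels tell the "report" route from the "brochure" route, so the namespaces alone cannot decide which one to apply.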

While I believe this would make for a very appealing visualization in
GUI-based sitemap authoring tools, I don't think this is a good model for
sitemaps where several flavors of the same MIME type are to be expressed
(which is very likely to be the case in Cocoon, given its power).

> When a request comes in, it would be tagged with the
> destination namespace (xhtml, wml, svg, whatever...).

You are confusing MIME types with namespaces. It is very likely that more
than one namespace participates directly in the creation of one single
MIME type (fo + svg -> pdf).

> When the source XML is parsed/generated, we discover
> its namespace from the root element, and go find
> ourselves that node in the graph. 

AHHHHH! No way! The root element has nothing to do with the namespaces
that can be found inside the document. You are confusing namespaces with
SGML-like doctypes, which, in a true XML world, make very little sense.
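
A quick way to see this, sketched in Python with the standard xml.etree.ElementTree module. The sample document and its example.org namespace URIs are invented for the demonstration; namespace_of is a small helper written here, not a library function.

```python
import xml.etree.ElementTree as ET

# A made-up document mixing three namespaces plus an unqualified element.
SAMPLE = """\
<doc:report xmlns:doc="http://example.org/doc"
            xmlns:sql="http://example.org/sql"
            xmlns:quote="http://example.org/quote">
  <doc:title>Quarterly figures</doc:title>
  <sql:query>select * from sales</sql:query>
  <quote:price symbol="XYZ"/>
  <note>no namespace here</note>
</doc:report>
"""


def namespace_of(tag):
    """Split ElementTree's '{uri}local' tag form; '' means no namespace."""
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""


root = ET.fromstring(SAMPLE)

# What you learn by looking at the root element only.
root_ns = namespace_of(root.tag)

# What is actually in the document: every element's namespace.
all_ns = {namespace_of(el.tag) for el in root.iter()}
```

The root reports only the doc namespace, while the full scan also turns up sql, quote, and an element in no namespace at all, so routing on the root element would miss most of what needs transforming.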

> The node then looks
> at the destination namespace and forwards the SAX
> events to a filter and on to the destination node.
> This process continues until it gets to the destination
> node, at which point it's serialized, which brings me
> nicely onto the next point...
> 
> I've got to admit, up to this point, I've made a
> rather large simplification. Great. We can transform
> from one namespace to another, but how on earth
> are we going to get a png or a jpeg out? The obvious
> answer is to treat mime types in a similar manner,
> and build them into the node graph. This causes
> problems, because suddenly we're not dealing with
> filters, we're dealing with serializers, which have
> a SAX input stream, but a *binary* output stream.
> To be frank, I'm still pondering this bit.
> 
> I've explained *how* something like this could work
> (loosely, I admit), but the question now has to be
> "why on earth would you want to?".
> 
> The answer is (and I did warn you this was *not* an
> immediate proposition) that at the moment, you
> *wouldn't* (this thing is *not* in the Cocoon2 target
> area, as far as I'm concerned). The sitemap handles
> pretty much anything most people are going to throw
> at it for the foreseeable future (and a hefty wadge that most
> people aren't <grin>)
> 
> Where I *can* see it being useful is where you're
> dealing with all kinds of different DTDs from a
> particular URI space, and matching becomes
> cumbersome. For example, imagine you had a project
> linked to CORBA. Everything in /objects/* was
> linked to a CORBA generator, so that /objects/<iiopID>
> retrieved the content of an object. You could
> potentially write a matcher, and put entries in the
> sitemap for each type of object. This could become
> somewhat cumbersome, particularly if you're targeting
> WAP and HTML and PDF, for example. Using the directed
> graph, you don't worry about it - just let the server
> work out the easiest way to translate the document
> into what the client wants.

I agree, but I've shown how namespace reaction works well only for
single-flavored transformations and I don't think this will be the case
in many situations. And, if it were the case, I'd rather use placeholders
for the transformation chain in the sitemap rather than forcing
namespace reaction and polluting the uri-based declarative model.

But I'm wide open to suggestions to integrate namespace reaction in the
sitemap if you find this simplifies your life in situations I can't
think of.
 
> I don't know, maybe this is a Really Bad Idea (tm),
> but I think it could help in some situations, particularly
> inside large application servers where you've got lots of
> people administering the system. If people could
> upload a stylesheet, and just specify the source
> and destination namespaces and let the server work
> out when to use it, we save ourselves a lot of
> configuration nightmares.

Yes, this is clearly the ideal scenario. I believe, on the other hand,
that the real scalability power comes from cascading sitemaps. Of
course, if namespace reaction is included in the sitemap, the cascading
capability will be inherited, so this is not a clear argument against
namespace reaction.

Like I said, suggestions for integration are welcome.

> Anyway, that's my random thought for the evening,
> hope reading this e-mail wasn't a *total* waste
> of time for those who made it ;)

Not at all :)

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<stefano@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Missed us in Orlando? Make it up with ApacheCON Europe in London!
------------------------- http://ApacheCon.Com ---------------------


