cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: [RT] Content aggregation
Date Fri, 13 Oct 2000 20:23:20 GMT
Niclas Hedhman wrote:

> >  <generate type="aggregator">
> >   ...
> >  </generate>
> > <branch>
> >     <dyn-resource>
> >         <transform resultID="titlebar"
> > src="/www/graphics/fancy/titlebar2html.xsl"/>
> >         <serialize type="html"/>
> >     </dyn-resource>
> >     <dyn-resource>
> >         <transform resultID="sidebar"
> > src="/www/graphics/fancy/sidebar2html.xsl"/>
> >         <serialize type="html"/>
> >     </dyn-resource>
> >     <dyn-resource>
> >         <transform resultID="content"
> > src="/www/graphics/fancy/content2html.xsl"/>
> >         <serialize type="html"/>
> >     </dyn-resource>
> >     <dyn-resource>
> >         <transform resultID="stockchart"
> > src="/www/graphics/fancy/data2chart.xsl"/>
> >         <serialize type="svg2png"/>
> >     </dyn-resource>
> >     <dyn-resource resultID="structure" >
> >         <transform src="/www/graphics/fancy/structure2html.xsl" />
> >         <serialize type="html"/>
> >     </dyn-resource>
> > </branch>
> 
> Hmmmm...
> 
> > And that the branch handler will be a virtual response handler, saving the
> > serialized content to temporary storage and feed each of the dyn-resources
> > with the resultIDs of the other dyn-resources as parameters to each of
> > them. (Cyclic dependencies makes it more interesting :o) )
> 
> Ah, you save them for later.... hmmmm.
> 
> Give me some time to digest that.

Ok, I did digest part of this and I'd like to continue the conversation
on content aggregation.

First let's summarize where we are now:

1) content is aggregated using an aggregating generator.

2) namespaces are used to identify the different semantic areas. One
namespace identifies the general structure.

3) depending on the URI called, the aggregated content (hopefully
cached) is passed to a different pipeline and the content is
transformed/filtered, then serialized to the right resource.

So far so good. As many suggested, the aggregation part is the easiest
once we define how it works and agree on a basic, very high-level
structure schema.

                    -------- o --------

So far, no special semantics have been added to the sitemap for content
aggregation.

The question is whether we need them or not.

Niclas suggested (see above) a 'forking pipeline' model.

I honestly don't like it.

There are many reasons, but the most important is:

It breaks inversion of control: the pipeline decides what to do all at
once, instead of waiting for the request to come. This is very
proxy-unfriendly behavior and goes against the patterns on which HTTP
is based.

So, for now, I'd vote against it.

Careful: I'm *NOT* saying that using a pure sitemap to come up with HTML
framesets or tablesets just by changing stylesheets is easy. It's not.
But simple things should be simple and complex things should be
possible.

So, let's try to outline a TODO list for implementing "content
aggregation".

Some thoughts:

1) an "aggregating" generator would need to have direct sitemap
knowledge. Why is this? It's pretty stupid to include an internal XML
resource after it has been serialized and then parse it again. The
aggregator should skip the serialize/parse stages and redirect the SAX
events to the consumer of the aggregated events.

This would break SoC, so forget an "aggregating generator": we need
specific sitemap semantics for the aggregation.

Should we add the <map:aggregate> tag? Does it make sense? I think it
does.

Should we have it pluggable like all the other actors? I'll let you
decide this... for now it seems FS to me, but some of you might have
good points on the need for a pluggable aggregator.

2) some URIs might behave differently depending on whether they are
called from the sitemap internals or not. A matcher would need to have
direct sitemap knowledge to know this, which is another SoC fracture,
and thus another semantic addition to the sitemap.

What is the most readable and least verbose way to do this? I have some
ideas but they all seem crappy to me, so I don't want to influence you.

3) The cache system needs to be improved to be able to cache aggregated
content. This is another reason to ask for semantics different from a
generator's, since the output of a generator will not be cached.

The aggregator output is a stream of SAX events. Caching a stream of SAX
events is the memory equivalent of the XML bytecode I was venturing to
implement a couple of weeks ago.... I'll do another RT about that in the
near future.
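
To make points 1 and 3 a bit more concrete, here is a minimal Java
sketch that uses only the standard org.xml.sax interfaces. It shows an
aggregator pushing the SAX events of each part straight into the
consumer (no serialize/parse round trip), and a recorder that keeps a
stream of SAX events in memory so it can be replayed later. The names
SaxSource, PartFilter, SaxAggregator and SaxRecorder are made up for
illustration and are not existing Cocoon classes. Namespace prefix
mappings and the other less common events are ignored to keep it short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical: anything able to push SAX events straight into a consumer. */
interface SaxSource {
    void emit(ContentHandler consumer) throws SAXException;
}

/**
 * Forwards element and text events of one part to the consumer.
 * startDocument/endDocument (and prefix mappings, in this sketch) fall
 * through to DefaultHandler's no-op methods and are therefore dropped.
 */
class PartFilter extends DefaultHandler {
    private final ContentHandler consumer;
    PartFilter(ContentHandler consumer) { this.consumer = consumer; }
    @Override public void startElement(String uri, String local, String qName,
                                       Attributes atts) throws SAXException {
        consumer.startElement(uri, local, qName, atts);
    }
    @Override public void endElement(String uri, String local, String qName)
            throws SAXException {
        consumer.endElement(uri, local, qName);
    }
    @Override public void characters(char[] ch, int start, int length)
            throws SAXException {
        consumer.characters(ch, start, length);
    }
}

/** Hypothetical aggregator: one document, one wrapper element, parts in order. */
class SaxAggregator {
    static void aggregate(String rootName, List<SaxSource> parts,
                          ContentHandler consumer) throws SAXException {
        consumer.startDocument();
        consumer.startElement("", rootName, rootName, new AttributesImpl());
        PartFilter filter = new PartFilter(consumer);
        for (SaxSource part : parts) {
            part.emit(filter); // events flow directly, nothing is serialized
        }
        consumer.endElement("", rootName, rootName);
        consumer.endDocument();
    }
}

/** Records SAX events in memory so they can be replayed without reparsing. */
class SaxRecorder extends DefaultHandler implements SaxSource {
    private final List<SaxSource> events = new ArrayList<>();

    @Override public void startDocument() { events.add(ContentHandler::startDocument); }
    @Override public void endDocument() { events.add(ContentHandler::endDocument); }
    @Override public void startElement(String uri, String local, String qName,
                                       Attributes atts) {
        // Copy: a SAX producer is allowed to reuse its Attributes object.
        Attributes copy = new AttributesImpl(atts);
        events.add(h -> h.startElement(uri, local, qName, copy));
    }
    @Override public void endElement(String uri, String local, String qName) {
        events.add(h -> h.endElement(uri, local, qName));
    }
    @Override public void characters(char[] ch, int start, int length) {
        // Copy: the character buffer belongs to the producer.
        char[] copy = Arrays.copyOfRange(ch, start, start + length);
        events.add(h -> h.characters(copy, 0, copy.length));
    }

    /** Replays every recorded event, in order, into the given consumer. */
    @Override public void emit(ContentHandler consumer) throws SAXException {
        for (SaxSource event : events) {
            event.emit(consumer);
        }
    }
    int size() { return events.size(); }
}
```

Used this way, a SaxRecorder passed as the consumer of
SaxAggregator.aggregate() ends up holding the aggregated content, and
its emit() can then feed whichever pipeline the requested URI selects.
Since the recorder is itself a SaxSource, a cached aggregate could also
be used as a part of another aggregation.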

That's it. All that remains is a good example to show the power of it
(our own documentation will be created out of content aggregation) and
lots of documentation to explain it to people (I'm announcing already
that I won't have time to contribute that in the near future, so don't
ask me!)

Well, as usual, comments are welcome.

After this, we will be ready for the first alpha release and hopefully a
beta not long after :)

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<stefano@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Missed us in Orlando? Make it up with ApacheCON Europe in London!
------------------------- http://ApacheCon.Com ---------------------

