Return-Path:
Mailing-List: contact cocoon-dev-help@xml.apache.org; run by ezmlm
Delivered-To: mailing list cocoon-dev@xml.apache.org
Received: (qmail 44769 invoked from network); 13 Oct 2000 20:28:32 -0000
Received: from pop.systemy.it (194.20.140.28) by locus.apache.org with SMTP; 13 Oct 2000 20:28:32 -0000
Received: from apache.org (pv12-pri.systemy.it [194.21.255.12]) by pop.systemy.it (8.8.8/8.8.3) with ESMTP id WAA28323 for ; Fri, 13 Oct 2000 22:28:27 +0200
Message-ID: <39E76F38.9A4A2EC4@apache.org>
Date: Fri, 13 Oct 2000 22:23:20 +0200
From: Stefano Mazzocchi
Organization: Apache Software Foundation
X-Mailer: Mozilla 4.72 [en] (Windows NT 5.0; I)
X-Accept-Language: en,it
MIME-Version: 1.0
To: cocoon-dev@xml.apache.org
Subject: Re: [RT] Content aggregation
References: <39DA41FE.B02C015@apache.org> <39DB0916.CB2E1FCD@bali.ac> <39DB45F8.E9CE52E9@apache.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Spam-Rating: locus.apache.org 1.6.2 0/1000/N

Niclas Hedhman wrote:
> > > > ...
> > > > > > > > > src="/www/graphics/fancy/titlebar2html.xsl"/>
> > > > > > > > > src="/www/graphics/fancy/sidebar2html.xsl"/>
> > > > > > > > > src="/www/graphics/fancy/content2html.xsl"/>
> > > > > > > > > src="/www/graphics/fancy/data2chart.xsl"/>
> > > > > > > > > > > > > > > > Hmmmm...
> > > And that the branch handler will be a virtual response handler, saving the
> > serialized content to temporary storage and feed each of the dyn-resources
> > with the resultIDs of the other dyn-resources as parameters to each of
> > them. (Cyclic dependencies makes it more interesting :o) )
> > Ah, you save them for later.... hmmmm.
> > Give me some time to digest that.

Ok, I did digest part of this and I'd like to continue the conversation
on content aggregation.

First let's summarize where we are now:

1) content is aggregated using an aggregating generator.

2) namespaces are used to identify the different semantic areas.
One namespace identifies the general structure.

3) depending on the URI called, the aggregated content (hopefully
cached) is passed to a different pipeline and the content is
transformed/filtered, then serialized to the right resource.

So far so good.

As many suggested, the aggregation part is the easiest once we define
how it works and a basic, very high-level structure schema.

                        -------- o --------

So far, no special semantics have been added to the sitemap for content
aggregation. The question is whether we need them or not.

Niclas suggested (see above) a 'forking pipeline' model.

I honestly don't like it. There are many reasons, but the most
important is:

It breaks inversion of control: the pipeline decides what to do all at
once, instead of waiting for the request to come.

This is very proxy-unfriendly behavior and goes against the patterns on
which HTTP is based.

So, for now, I'd vote against it.

Careful: I'm *NOT* saying that using a pure sitemap to come up with
HTML framesets or tablesets just by changing stylesheets is easy; no,
it's not. But simple things should be simple and complex things should
be possible.

So, let's try to outline a TODO list for implementing "content
aggregation".

Some thoughts:

1) an "aggregating" generator would need to have direct sitemap
knowledge.

Why is this? It's pretty stupid to include an internal XML resource
after it has been serialized and then parse it again. The aggregator
should skip the serialize/parse stages and redirect the SAX events to
the consumer of the aggregated events.

This would break SoC, so forget an "aggregating generator": we need
specific sitemap semantics for the aggregation.

Should we add the tag? Does it make sense? I think it does.

Should we have it pluggable like all the other actors? I'll let you
decide this... for now it seems FS to me, but some of you might have
good points on the need for a pluggable aggregator.
2) some URIs might behave differently depending on whether they are
called from the sitemap internals or not.

A matcher would need to have direct sitemap knowledge to know this:
another SoC fracture, thus another semantic addition to the sitemap.

What is the most readable and least verbose way to do this? I have some
ideas but they all seem crappy to me. So I don't want to influence you.

3) The cache system needs to be improved to be able to cache aggregated
content.

This is another reason to ask for semantics distinct from the
generator, since the output of a generator will not be cached.

The aggregator output is a stream of SAX events. Caching a stream of
SAX events is the memory equivalent of the XML bytecode I was venturing
to implement a couple of weeks ago.... I'll do another RT about that in
the near future.

That's it. All that remains is a good example to show the power of it
(our own documentation will be created out of content aggregation) and
lots of documentation to explain it to people (I'm announcing already
that I won't have time to contribute that in the near future, so don't
ask me!)

Well, as usual, comments are welcome.

After this, we will be ready for the first alpha release and hopefully
a beta not much after :)

--
Stefano Mazzocchi      One must still have chaos in oneself to be able
                       to give birth to a dancing star.
                                             Friedrich Nietzsche
--------------------------------------------------------------------
 Missed us in Orlando? Make it up with ApacheCON Europe in London!
------------------------- http://ApacheCon.Com ---------------------
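
A rough illustration of thoughts 1) and 3) above, offered as a sketch
and not as Cocoon code: each part is parsed once, its SAX events are
kept in memory, and the cached events are later replayed straight into
the consumer inside a wrapper element, with no serialize/parse round
trip in between. It uses only the standard org.xml.sax API shipped with
the JDK. The names SaxEventBuffer, SaxBufferDemo, Printer and aggregate
are hypothetical, the lambda syntax assumes a modern Java, and
namespace prefix mappings, attribute output and cache invalidation are
left out on purpose.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical in-memory cache of SAX events (not a Cocoon class). */
class SaxEventBuffer extends DefaultHandler {
    /** One recorded event that can re-fire itself on any consumer. */
    interface Event {
        void replay(ContentHandler target) throws SAXException;
    }

    private final List<Event> events = new ArrayList<>();

    // startDocument/endDocument are deliberately not recorded, so the
    // cached fragment can be embedded inside a larger document.

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        // The parser may reuse its Attributes object, so keep a copy.
        final Attributes copy = new AttributesImpl(atts);
        events.add(t -> t.startElement(uri, local, qName, copy));
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        events.add(t -> t.endElement(uri, local, qName));
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser owns the char buffer, so copy the slice we were given.
        final char[] copy = Arrays.copyOfRange(ch, start, start + length);
        events.add(t -> t.characters(copy, 0, copy.length));
    }

    /** Re-fires the recorded events; nothing is serialized or parsed here. */
    void replay(ContentHandler target) throws SAXException {
        for (Event e : events) {
            e.replay(target);
        }
    }
}

class SaxBufferDemo {
    /** Stand-in consumer: writes element names and text, ignores attributes. */
    static class Printer extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            out.append('<').append(qName).append('>');
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            out.append("</").append(qName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }
    }

    /** Parses each part once, then replays the cached events under rootName. */
    static String aggregate(String rootName, String... parts) {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            List<SaxEventBuffer> cache = new ArrayList<>();
            for (String part : parts) {
                SaxEventBuffer buffer = new SaxEventBuffer();
                reader.setContentHandler(buffer);
                reader.parse(new InputSource(new StringReader(part)));
                cache.add(buffer);
            }
            Printer consumer = new Printer();
            consumer.startElement("", rootName, rootName, new AttributesImpl());
            for (SaxEventBuffer buffer : cache) {
                buffer.replay(consumer); // events go directly to the consumer
            }
            consumer.endElement("", rootName, rootName);
            return consumer.out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(aggregate("page",
                "<titlebar>Home</titlebar>",
                "<sidebar><item>News</item></sidebar>"));
    }
}
```

The same buffers could be replayed into a different consumer for each
requesting URI, which is where caching the events rather than the
serialized bytes would pay off.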