cocoon-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: Variations on a theme by Cocoon
Date Wed, 16 Feb 2000 17:09:37 GMT
Pierpaolo Fumagalli wrote:
> 
> Stefano Mazzocchi wrote:
> >
> > well, the line between them is thinner: we're not talking about
> > parameter-free producers, but producers that contain some information
> > that was generated from another source that can change independently.
> 
> When I say "parameter free" I mean the fact that they don't require any
> input file name as a parameter...

Ah, ok.

> If you look through the CVS logs, when we still had the "job" object
> instead of the request/response, I was passing the string parameter
> "source" to the Producer, telling it which source file to use...
> 
> > In case of DummyProducer, we have no parameters and no external original
> > file... everything was done by hand and there is nothing that can change
> > externally to trigger a "regeneration" of this producer.
> >
> > Both FileProducer and DummyProducer are immutable in such sense.
> >
> > But if you take hello.xml and compile it into a producer, the code you
> > end up with is _exactly_ like dummyproducer, but its behavior is very
> > different: if hello.xml is updated, the system _must_ trigger the
> > regeneration of the producer, before executing it again.
> >
> > Extending this into the most complex case, "generated producers" are
> > linked to their generation process.
> >
> > At the end, FileProducer is a very special case of a "generated
> > producer" where the generator is a human being and it recompiles the
> > thing by hand if something changes.
> >
> > Eventually, all producers can be seen as "generated producers", and for
> > this reason they expose _two_ ergodic periods:
> >
> > 1) the period of their logic generation
> > 2) the period of their generated data
> >
> > which are totally uncorrelated.
> 
> Agreed... You didn't understand what I meant when talking about
> "producer without parameters" :)

no problem.
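
A minimal Java sketch of the two periods discussed above, assuming nothing
about the real Cocoon code. The class name GeneratedProducer, the "stamp"
supplier (which stands in for something like the last-modified time of
hello.xml) and the regenerate function (which stands in for "compile
hello.xml into a producer") are all hypothetical names made up for
illustration.

```java
import java.util.function.Function;
import java.util.function.LongFunction;
import java.util.function.LongSupplier;

/** Hypothetical sketch: a producer whose logic is regenerated when its origin changes. */
final class GeneratedProducer {
    // Assumption: some monotonic stamp for the origin, e.g. hello.xml's last-modified time.
    private final LongSupplier originStamp;
    // Assumption: stands in for "compile the origin into producer logic".
    private final LongFunction<Function<String, String>> regenerate;

    private Function<String, String> logic; // the currently generated logic
    private long builtFrom;                 // origin stamp the logic was built from
    private int regenerations;              // how many times the logic was (re)built

    GeneratedProducer(LongSupplier originStamp,
                      LongFunction<Function<String, String>> regenerate) {
        this.originStamp = originStamp;
        this.regenerate = regenerate;
    }

    /** Produces output for one request, regenerating the logic first if the origin moved. */
    String produce(String requestData) {
        long current = originStamp.getAsLong();
        if (logic == null || current != builtFrom) { // period 1: logic generation
            logic = regenerate.apply(current);
            builtFrom = current;
            regenerations++;
        }
        return logic.apply(requestData);             // period 2: generated data
    }

    int regenerations() {
        return regenerations;
    }
}
```

The output can differ on every call without the logic being rebuilt, and the
logic is rebuilt only when the origin stamp moves, so the two periods stay
independent. A hand-written producer such as DummyProducer would correspond
to a stamp that never changes.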
 
> > hmmm, not quite. Instead of treating "generated producers" as special
> > cases (with their own classloaderproducer) I'd treat fileProducer as a
> > special case of an externally generated producer which cannot change
> > during execution.
> 
> Ok... See the example below...
> 
> > Hmmmm... what if all producers are supposed to be "generated" and only
> > special ones are not? something like they have a null instead of a
> > Modificable chain? just thinking out loud.
> 
> Think out loud as long as you can... Because I'm following you... And I
> think I got how to do it...

Yesss, I think you did..
 
> > [...]
> > > So, IMVHO, we have to deal with one more "element" in our processing
> > > chain, this "Generator": and the chain itself changes: from
> > >
> > > producer -> filter* -> serializer
> > >
> > > it becomes
> > >
> > > generator -> filter* -> serializer
> > >
> > > and it's the task of the producer to retrieve the appropriate generator
> > > from the request parameter.
> >
> > I like this name change: in fact producer is something more than the
> > general "producer/consumer" pattern. I like "generator" a lot better.
> > But it should take the producer's place, not requiring another
> > processing step...
> 
> Ok... let's call it generator from now on... Also because we have
> XMLProducer and XMLConsumer down in the SAX api (producing SAX events,
> and consuming SAX events.. wrapper classes for content and lexical
> handlers...)

Exactly, I like these patterns:

 generator implements producer
 filter implements producer, consumer
 serializer implements consumer 

very nice, very symmetric.
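
The three patterns above can be sketched in Java roughly as follows. This is
only an illustration: the interface names XmlProducer and XmlConsumer, the
use of plain strings as "events" instead of real SAX events, and the three
toy implementations are all assumptions, not the actual Cocoon or SAX
wrapper classes.

```java
/** Hypothetical: something that receives XML events (plain strings here, not SAX). */
interface XmlConsumer {
    void event(String event);
}

/** Hypothetical: something that emits XML events to a consumer. */
interface XmlProducer {
    void setConsumer(XmlConsumer consumer);
}

/** generator implements producer */
interface Generator extends XmlProducer {
    void generate();
}

/** filter implements producer, consumer */
interface Filter extends XmlProducer, XmlConsumer {
}

/** serializer implements consumer */
interface Serializer extends XmlConsumer {
    String result();
}

/** Toy generator: emits a fixed three-event document. */
final class HelloGenerator implements Generator {
    private XmlConsumer next;
    public void setConsumer(XmlConsumer consumer) { next = consumer; }
    public void generate() {
        next.event("<hello>");
        next.event("world");
        next.event("</hello>");
    }
}

/** Toy filter: upper-cases text events and passes tags through untouched. */
final class UpperCaseFilter implements Filter {
    private XmlConsumer next;
    public void setConsumer(XmlConsumer consumer) { next = consumer; }
    public void event(String event) {
        next.event(event.startsWith("<") ? event : event.toUpperCase());
    }
}

/** Toy serializer: concatenates every event it receives. */
final class TextSerializer implements Serializer {
    private final StringBuilder out = new StringBuilder();
    public void event(String event) { out.append(event); }
    public String result() { return out.toString(); }
}
```

Wiring is then just generator.setConsumer(filter), filter.setConsumer(serializer),
generator.generate(). Because a filter is both a consumer and a producer, any
number of filters can sit between the two ends.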

> > > Also, this makes things easier since only the Producer is "driven" by
> > > the request parameters (meaning that "it changes its behaviour depending
> > > on the request data").
> >
> > Hmmm, bad naming. Call it Producer or Generator, I don't want _both_ to
> > coexist. We need a pattern to define "an entity able to produce XML
> > content and able to reflect the changes to all the other entities
> > involved in the production".
> 
> Ok... Understood...
> 
> > These other entities are request/session parameters as well as data
> > repositories such as files, DBMS, directory servers, whatever...
> >
> > In fact, even FileProducer could save the parsed DOM internally and
> > behave as a "collection of compiled static pages". I think there is no
> > real difference between a producer and a generator, but we _have_ to
> > take into account the possibility of "generated producers" and consider
> > those who are not special cases, not the other way around.
> 
> Interesting concept... Also because I have already implemented some SAX
> serialization classes... It's pointless to re-parse the same source
> (validating it, maybe, against a HUGE schema) over and over just because
> the stylesheet changed... We can avoid the parsing and store the already
> parsed document, and go down to the parser ONLY when the source XML has
> been modified...

right... totally! Memory will be the problem, but if you need something
like this, memory is the easiest problem to solve :)
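
A rough Java sketch of "go down to the parser ONLY when the source XML has
been modified". The class ParsedDocumentCache, the parser function and the
lastModified lookup are hypothetical; in a real system the cached value
would more likely be a DOM or a recorded SAX event stream than whatever
type D is here.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.ToLongFunction;

/** Hypothetical sketch: keeps parsed documents and re-parses only when the source changed. */
final class ParsedDocumentCache<D> {

    /** One cached parse result together with the source stamp it was parsed from. */
    private static final class Entry<D> {
        final long stamp;
        final D document;
        Entry(long stamp, D document) {
            this.stamp = stamp;
            this.document = document;
        }
    }

    private final Map<String, Entry<D>> entries = new HashMap<>();
    private final Function<String, D> parser;          // assumption: the expensive parse/validate step
    private final ToLongFunction<String> lastModified; // assumption: e.g. file modification time
    private int parseCount;

    ParsedDocumentCache(Function<String, D> parser, ToLongFunction<String> lastModified) {
        this.parser = parser;
        this.lastModified = lastModified;
    }

    /** Returns the parsed document for a source, parsing only on first use or after a change. */
    D get(String source) {
        long stamp = lastModified.applyAsLong(source);
        Entry<D> cached = entries.get(source);
        if (cached == null || cached.stamp != stamp) {
            cached = new Entry<>(stamp, parser.apply(source));
            entries.put(source, cached);
            parseCount++;
        }
        return cached.document;
    }

    int parseCount() {
        return parseCount;
    }
}
```

A stylesheet change never touches this cache, so the filter can be re-run on
the stored document without re-validating the source. The sketch keeps
everything in memory and never evicts, which is exactly the memory cost
mentioned above.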
 
> > > I just hope it makes sense...
> >
> > It really does. I'm happy we are still tuned even if we have half a
> > planet in between :)
> 
> :) hehehehe :) just a little bit of psychic abilities :)
> 
> -----------------------------------------------------------------------
> 
> Ok... Let's see if this one can work.... (I write down the different
> sitemap configurations because I find it to be easier to work on "real"
> examples...)
> 
> <process uri="*.html" source="sources/xml/*.xml">
>   <filter name="xslt">
>     <parameter name="stylesheet" value="doc-html.xsl"/>
>   </filter>
>   <serializer name="html"/>
> </process>
> 
> This entry means: if I get a request for "*.html" (index.html,
> foo.html...) the source file is "sources/xml/*.xml" (sources/xml/index.xml,
> sources/xml/foo.xml). I have no "generator" in the sitemap, so, I assume,
> the source file is an XML file, and I call the parser directly.
> To this document I apply the "doc-html.xsl" stylesheet and I serialize
> it as an HTML file.

Hmmmm, being picky, I see asymmetry in this: there is a "default", or
"hidden" behavior that is generation from file parsing. Is this _so_
important? Does it deserve such a special place? (not questioning, just
wondering myself)

what about Generation from URL parsing? what about "adaptation" from
other type of streams (serial lines for example)?
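
For reference, the uri/source wildcard pairing used in the first sitemap
entry can be sketched in Java like this. The class WildcardMapping is a
hypothetical name, and the sketch assumes exactly one "*" in the uri pattern
and that "*" matches any run of characters (including "/"); the thread does
not pin down either point.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: maps a request uri to a source path via a single "*" wildcard. */
final class WildcardMapping {
    private final Pattern uri;
    private final String source;

    WildcardMapping(String uriPattern, String sourcePattern) {
        int star = uriPattern.indexOf('*');
        if (star < 0) {
            throw new IllegalArgumentException("uri pattern needs a '*': " + uriPattern);
        }
        // Quote the literal parts and capture whatever the wildcard matches.
        this.uri = Pattern.compile(
                Pattern.quote(uriPattern.substring(0, star))
                + "(.*)"
                + Pattern.quote(uriPattern.substring(star + 1)));
        this.source = sourcePattern;
    }

    /** Returns the source path for a matching request uri, or null if it does not match. */
    String resolve(String requestUri) {
        Matcher m = uri.matcher(requestUri);
        return m.matches() ? source.replace("*", m.group(1)) : null;
    }
}
```

With these assumptions "*.html" would also match "custom/index.html", so a
real sitemap would still need a rule for which entry wins when patterns
overlap.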

> <process uri="custom/*.html">
>   <generator class="org.betaversion.generators.MyGenerator"/>
>   <filter name="xslt">
>     <parameter name="stylesheet" value="foobar.xsl"/>
>   </filter>
>   <serializer name="html"/>
> </process>
> 
> This entry means: if I get a request for "custom/*.html"
> (custom/index.html, custom/foo.html), I cannot associate any source
> file, BUT since I have a generator, I simply hand the request to the
> specified class ("org.betaversion.generators.MyGenerator"), that will
> take care of generating XML data.
> The specified class can then generate data from the request (I can send
> an XML document in POST and it will produce XML, or it takes the request
> parameters and spits them all out in XML... blah blah blah).
> Then, again, I apply a stylesheet and serialize the stuff as HTML.
> 
> <process uri="text/*.html" source="sources/text/*.txt">
>   <generator class="org.betaversion.generators.CSVParser"/>
>   <filter name="xslt">
>     <parameter name="stylesheet" value="csv-html.xsl"/>
>   </filter>
>   <serializer name="html"/>
> </process>
> 
> This entry means: If I get a request for "text/*.html" I associate a
> source file called "sources/text/*.txt" (I don't repeat the examples).
> The source file name is passed to the specified generator (so, not
> handled by the parser), in my case a Comma Separated Values text
> parser... Which will do a lot of super duper things and convert magically
> all the stuff into XML...
> Then, again, I apply a stylesheet and serialize the stuff as HTML.
> 
> <process uri="xsp/*.html" source="sources/xsp/*.xsp">
>   <generator>
>     <filter name="xslt">
>       <parameter name="stylesheet" value="xsp-java.xsl"/>
>     </filter>
>     <serializer name="javac"/>
>   </generator>
>   <filter name="xslt">
>     <parameter name="stylesheet" value="doc-html.xsl"/>
>   </filter>
>   <serializer name="html"/>
> </process>
> </sitemap>
> 
> I match xsp/*.html, I associate the source sources/xsp/*.xsp. I parse
> the source, apply the xsp-java.xsl filter, and serialize it with JavaC
> (which will create a nifty .class file).
> The class file becomes my producer, which spits out XML, to which I apply
> the stylesheet and (AGAIN) serialize as HTML...
> 
> <process uri="whatever/*.html" source="sources/whatever/*.xml">
>   <generator class="org.betaversion.interpreter.MyInterpreter">
>     <filter name="xslt">
>       <parameter name="stylesheet" value="whatever.xsl"/>
>     </filter>
>     <serializer name="xml"/>
>   </generator>
>   <filter name="xslt">
>     <parameter name="stylesheet" value="foobar.xsl"/>
>   </filter>
>   <serializer name="html"/>
> </process>
> </sitemap>
> 
> I match whatever/*.html, then parse the source file, (it's XML data),
> BUT, instead of giving it to the first filter (the one applying
> "foobar.xsl"), I filter it using "whatever.xsl" and serialize it in XML.
> Then the specified generator is called, with, AS A SOURCE, not
> "sources/whatever/*.xml", but the result of the first processing, here
> the generator (org.betaversion.interpreter.MyInterpreter) has the
> opportunity to parse the output of the first translation (or do whatever
> it wants with that file, which might not even be XML), generate XML data
> which will be styled thru foobar.xsl, and serialized...

Brilliant! I mean it: I think this is a piece of art... but I have a
slightly different suggestion, see below

 <process uri="/~stefano/docs/*">
  <generator file="/home/stefano/docs/*.xml"/>
  ...
 </process>

this indicates that all the uris such as "/~stefano/docs/cocoon-handout"
or "/~stefano/docs/college-thesis" will be generated by reading the
corresponding xml file from disk. Or something like this

 <process uri="/~stefano/news/slashdot">
  <generator url="http://www.slashdot.org/xml/"/>
  <filter type="stuff-stefano-likes"/>
  ...
 </process>

or

 <process uri="/~stefano/mail">
  <generator class="org.apache.jetspeed.JetSpeed">
   <property name="server" value="xml.apache.org"/>
   <property name="user" value="stefano"/>
   <property name="password" value="Uwish:)"/>
  </generator>
  ...
 </process>

or
 
 <process uri="/~stefano/address-book">
  <generator>
   <generator file="/home/stefano/data/address-book.xsp"/>
   <serializer type="xsp"/>
  </generator>
  ...
 </process>

where the serializer "xsp" accounts for multiple namespace-based taglibs
+ code creation + code compilation. (In fact, this is not really
filtering, unless you want to specify all the filters in the sitemap
instead of reacting to the taglibs used internally in the XSP page;
anyway, this is an implementation detail at this point.)

or even

 <process uri="/~stefano/">
  <generator>
   <generator file="/home/stefano/docs/home.xml"/>
   <serializer type="xml"/>
  </generator>
  ... 
 </process>

which avoids parsing overhead by "precompiling" the home.xml page (which
is static!) into a class that spits XML without parsing overhead. Note
that what's below

 <process uri="/~stefano/">
  <generator file="/home/stefano/docs/home.xml"/>
  ... 
 </process>

gives the exact same result, but could, in theory, behave differently if
the validation stage is very heavy and the update frequency is very low.
anyway, this is, again, an implementation detail.

At the end, I like Pier's sub-generator chain _very_much_, but I think
we should _always_ expose the generator. This would avoid the need of
complex tables and increase readability and reduce management costs
(since what you see is what you get!) 
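
A small Java sketch of the sub-generator chain with the generator always
exposed, under heavy simplification: documents are plain strings, and
generators, filters and serializers are just functions. The class
ProcessEntry and its method names are hypothetical and only mirror the shape
of the sitemap entries above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.UnaryOperator;

/** Hypothetical sketch of one <process> entry: generator -> filter* -> serializer. */
final class ProcessEntry {
    private final Function<String, String> generator;   // always explicit, never a hidden default
    private final List<UnaryOperator<String>> filters = new ArrayList<>();
    private UnaryOperator<String> serializer = doc -> doc; // identity, like serializing as XML

    ProcessEntry(Function<String, String> generator) {
        this.generator = generator;
    }

    ProcessEntry filter(UnaryOperator<String> filter) {
        filters.add(filter);
        return this;
    }

    ProcessEntry serializer(UnaryOperator<String> serializer) {
        this.serializer = serializer;
        return this;
    }

    /** Runs the whole chain for one request. */
    String handle(String request) {
        String doc = generator.apply(request);
        for (UnaryOperator<String> f : filters) {
            doc = f.apply(doc);
        }
        return serializer.apply(doc);
    }

    /**
     * Wraps this chain so it can serve as the generator of an outer entry:
     * the outer generator logic receives this chain's output as its source.
     */
    Function<String, String> asGenerator(Function<String, String> outerLogic) {
        return request -> outerLogic.apply(handle(request));
    }
}
```

The inner chain is an ordinary entry, so nesting needs no special table: the
outer entry just receives inner.asGenerator(...) where a flat entry would
receive a file or class based generator.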

> (shit, it took almost three hours to write down this message! now digest
> it :)

I think we're getting there... yes, the fog is slowly clearing...

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<stefano@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------
 Come to the first official Apache Software Foundation Conference!  
------------------------- http://ApacheCon.Com ---------------------

