Mailing-List: contact cocoon-dev-help@xml.apache.org; run by ezmlm
Precedence: bulk
Reply-To: cocoon-dev@xml.apache.org
Message-ID: <3DE16068.3040101@nada.kth.se>
Date: Mon, 25 Nov 2002 00:27:36 +0100
From: Daniel Fagerstrom <danielf@nada.kth.se>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US;
 rv:1.0.1) Gecko/20020823 Netscape/7.0
MIME-Version: 1.0
To: cocoon-dev@xml.apache.org
Subject: Re: [RT] Using pipeline as sitemap components (long)
References: <GMEBIBHGAOFGJCDPJANDEEGMFMAA.cziegeler@s-und-n.de>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

Carsten Ziegeler wrote:
> Thanks Sylvain for this RT on a missing piece for the blocks concept.

I also find the concepts proposed by Sylvain useful and important and 
will discuss below how they can be realized by extending the 
cocoon:-protocol to be writable as well as readable.

> Sylvain Wallez wrote:
> 
>><snip/>
>>
>>
>>Pipelines as generators
>>-----------------------
>>
>>This leads to a first conclusion : using a pipeline as a generator means 
>>using the SAX events produced by the last XMLProducer of that pipeline, 
>>i.e. the last transformer or the generator if there are no transformers.
>>
> 
> This is only the technical implementations because of performance. The
> first implementation of the cocoon:-protocol actually required the xml
> serializer as the result was serialized and then parsed again - for
> performance reason we made it possible to use the SAX stream directly
> from the last XMLProducer.
> In fact, that the serializer does not have to be the xml serializer
> can be considered as a bug or weakness.

IMO this interpretation of what the cocoon:-protocol does is very 
important. We have two possible interpretations of what the 
cocoon:-protocol does: 1. it generates SAX events from the last 
XMLProducer in the pipeline, or 2. it generates octets from the 
serializer in the pipeline (which for the intended use cases are 
supposed to form an xml-document).

I find the first interpretation problematic. Consider the case where we 
have a pipeline whose serializer is of pipeline type (i.e. it uses the 
transformers and the serializer from another pipeline), and we use it 
through the cocoon:-protocol in the generator of another pipeline. What 
would it mean to take away the serializer in this case? Skipping the 
pipeline serializer, or skipping the serializer in the used pipeline?

With the second interpretation, i.e. the cocoon:-protocol always makes 
the octet stream from the serializer available, the example in the last 
paragraph has an obvious interpretation. As the cocoon:-protocol is a 
protocol, and protocols are normally supposed to make an octet stream 
available (an InputStream), it seems more natural that it should deliver 
the octet stream from the serializer than a SAX stream from the 
pipeline step before the serializer.

For performance reasons a SAX stream should of course be available 
from the cocoon:-protocol if the serializer is of xml type. As a minor 
technical note: it would probably be more natural to let the 
xml-serializer (and possibly the proposed pipeline serializer) implement 
XMLProducer (or to find some other way to make SAX events available from 
some of the serializers).

If we use a pipeline ending with e.g. a pdf-serializer from the 
cocoon:-protocol, this should mean that it actually produces a pdf octet 
stream and that it would be an error to use it as a source for e.g. the 
file generator. If we consider it FS to let the cocoon:-protocol 
generate anything other than xml, it should be considered an error to 
use a non-xml-producing serializer from the cocoon:-protocol.
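
A minimal sketch of the second interpretation, in Java. It assumes a 
hypothetical PipelineSource that carries the serializer's octets plus a 
media type; that class, ElementCounter, generate and rejects are all 
invented for illustration and are not Cocoon classes. Only the JAXP/SAX 
calls are real APIs. The consumer parses the octets when they are xml 
and refuses a pdf stream:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class ProtocolSourceSketch {

    /** Hypothetical: what the cocoon:-protocol hands out, octets plus a media type. */
    static final class PipelineSource {
        final byte[] octets;
        final String mimeType;

        PipelineSource(byte[] octets, String mimeType) {
            this.octets = octets;
            this.mimeType = mimeType;
        }

        InputStream getInputStream() {
            return new ByteArrayInputStream(octets);
        }
    }

    /** Stands in for the next pipeline step; it only counts start tags. */
    static final class ElementCounter extends DefaultHandler {
        int elements;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            elements++;
        }
    }

    /** Roughly what a file generator does: parse the octets, refusing non-xml input. */
    static int generate(PipelineSource source) {
        if (!"text/xml".equals(source.mimeType)) {
            throw new IllegalArgumentException("not an xml source: " + source.mimeType);
        }
        ElementCounter counter = new ElementCounter();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source.getInputStream(), counter);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse source", e);
        }
        return counter.elements;
    }

    /** True when the source is rejected as a generator input. */
    static boolean rejects(PipelineSource source) {
        try {
            generate(source);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        PipelineSource xml = new PipelineSource(
                "<doc><p/></doc>".getBytes(StandardCharsets.UTF_8), "text/xml");
        PipelineSource pdf = new PipelineSource(
                new byte[] {'%', 'P', 'D', 'F'}, "application/pdf");
        System.out.println("elements in xml source: " + generate(xml));
        System.out.println("pdf source rejected: " + rejects(pdf));
    }
}
```

The SAX shortcut discussed above would bypass the parse step when the 
serializer is of xml type; the sketch leaves that optimization out.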

<snip/>

>>Pipelines as serializers
>>------------------------
>>
>><snip/>
>>
>>How do we use this ? Well, just as for the generator, let's define a new 
>>"pipeline" serializer :
>>
>>  <map:generate src="another_xdoc.xml"/>
>>  <map:serialize type="pipeline" src="doc2pdf"/>
>>
>>Note : the "src" attribute doesn't currently exist on <map:serialize>, 
>>but it seems the more natural and consistent way to name the called 
>>pipeline. Wether this translates to implementing SitemapModelComponent 
>>or not is another story.
>>
> 
> Ok, I would call this "src" but something different, but that doesn't
> play a role for the concept itself.
> What I don't like is that a complete pipeline is called and there
> the generator is ignored. This would confuse everyone, I guess.

I agree, just ignoring the generator would be confusing. There seem to 
be two options: creating a new sitemap construct for pipelines without 
a generator, or letting the generator do something meaningful. The 
first option has already been discussed in several comments; here I 
would like to see what happens if we try to follow the latter path.

If we continue along the way hinted at by Carsten's comment about the 
cocoon:-protocol above, a pipeline used as a serializer could be thought 
of as a writable protocol, and as such it makes a writable octet stream 
(an OutputStream) available (as it is used as a serializer it is of 
course also supposed to generate output). We would thus like to make 
the cocoon:-protocol writable as well. Of course the writable 
cocoon:-protocol should make a content handler for SAX input available 
when appropriate, but this is only for optimization purposes.

Here we immediately run into the problem that generators are not the 
inverse of serializers. To get into more technical detail: a serializer 
implements SitemapOutputComponent, which contains the method 
setOutputStream. The OutputStream is set by the context; in ordinary 
servlet usage it is set to the OutputStream of the HttpResponse object. 
This is analogous to unix pipelines, where pipeline components write to 
standard output and the context can redirect the standard output. 
Generators, however, do not implement any corresponding 
"SitemapInputComponent"-interface: instead of getting their InputStream 
from the context, they are responsible for finding their input 
(typically by applying a SourceResolver on the content of the src 
attribute). To continue the unix analogy, a generator would correspond 
to a command that ignores the standard input and instead requires a 
file name in the parameter list.
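
The asymmetry can be sketched in Java as follows. This is a toy model 
under stated assumptions, not the Cocoon API: OutputComponent is a 
simplified stand-in for SitemapOutputComponent, Resolver is a simplified 
stand-in for SourceResolver, and ToySerializer, ToyGenerator and run are 
hypothetical names:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class AsymmetrySketch {

    /** Simplified stand-in for SitemapOutputComponent: the context supplies the sink. */
    interface OutputComponent {
        void setOutputStream(OutputStream out);
    }

    /** Simplified stand-in for SourceResolver: maps a src attribute to content. */
    interface Resolver {
        String resolve(String src);
    }

    /** Writes wherever the context tells it to, like a command writing to stdout. */
    static final class ToySerializer implements OutputComponent {
        private OutputStream out;

        @Override
        public void setOutputStream(OutputStream out) {
            this.out = out;
        }

        void serialize(String content) {
            try {
                out.write(content.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Finds its own input from src, like a command that wants a file name argument. */
    static final class ToyGenerator {
        private final Resolver resolver;

        ToyGenerator(Resolver resolver) {
            this.resolver = resolver;
        }

        String generate(String src) {
            return resolver.resolve(src);
        }
    }

    /** The "context": it can redirect the output, but has no way to redirect the input. */
    static String run(String src) {
        Map<String, String> files = new HashMap<>();
        files.put("another_xdoc.xml", "<doc/>");

        ToyGenerator generator = new ToyGenerator(files::get);
        ToySerializer serializer = new ToySerializer();
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        serializer.setOutputStream(sink); // redirecting "standard output"

        serializer.serialize(generator.generate(src));
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(run("another_xdoc.xml"));
    }
}
```

Note that run can pick the sink for the serializer, while the only thing 
it can hand the generator is a name to resolve.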

If we, as a thought experiment, would like generators to be the 
"inverse" of serializers, they should implement a 
"SitemapInputComponent"-interface which would contain a setInputStream 
method or possibly a setInputSource method. It should probably also 
contain a getContentHandler method, for efficiency reasons when the 
input happens to be xml. The input would then be set from the src 
attribute when it exists, and otherwise from the "standard input" set 
by the context. In servlet use, I guess that standard input would 
correspond to the InputStream from the request object; when used as a 
writable cocoon:-protocol, standard input would be set to whatever 
writes into the protocol.

Now I am of course completely aware that it would be a bad idea to 
change the Generator interface. Instead one could maybe let the 
FileGenerator and the proposed PipelineGenerator (and possibly some 
others) implement some "WritableGenerator"-interface containing the 
methods listed above. Then the context, e.g. the writable 
cocoon:-protocol or the servlet, could set the input for a 
WritableGenerator and ignore it for an ordinary Generator.
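
A rough Java sketch of the idea, under heavy simplification. The 
Generator interface here is a one-method toy, not the real Cocoon 
interface; WritableGenerator carries only setInputStream (setInputSource 
and getContentHandler are left out); SrcOnlyGenerator, 
StreamOrSrcGenerator and runWith are hypothetical names:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class WritableGeneratorSketch {

    /** Toy stand-in for the existing Generator interface, left unchanged. */
    interface Generator {
        String generate();
    }

    /** The proposed opt-in extension: the context may supply the input. */
    interface WritableGenerator extends Generator {
        void setInputStream(InputStream in);
    }

    /** An ordinary generator: it only knows its src attribute. */
    static final class SrcOnlyGenerator implements Generator {
        private final String src;

        SrcOnlyGenerator(String src) {
            this.src = src;
        }

        @Override
        public String generate() {
            return "resolved:" + src;
        }
    }

    /** Uses src when present, otherwise the "standard input" set by the context. */
    static final class StreamOrSrcGenerator implements WritableGenerator {
        private final String src; // null when the sitemap gives no src attribute
        private InputStream in;

        StreamOrSrcGenerator(String src) {
            this.src = src;
        }

        @Override
        public void setInputStream(InputStream in) {
            this.in = in;
        }

        @Override
        public String generate() {
            if (src != null) {
                return "resolved:" + src;
            }
            if (in == null) {
                throw new IllegalStateException("no src attribute and no input set");
            }
            try {
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[1024];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
                return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** The context sets the input for a WritableGenerator and ignores it otherwise. */
    static String runWith(Generator generator, InputStream standardInput) {
        if (generator instanceof WritableGenerator) {
            ((WritableGenerator) generator).setInputStream(standardInput);
        }
        return generator.generate();
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream("<in/>".getBytes(StandardCharsets.UTF_8));
        System.out.println(runWith(new StreamOrSrcGenerator(null), in));
        System.out.println(runWith(new SrcOnlyGenerator("a.xml"), in));
    }
}
```

Existing generators keep working unchanged in this sketch, since the 
context only touches components that opt in to the extra interface.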


>>Pipelines as transformers
>>-------------------------
<snip/>

We can use the writable cocoon:-protocol described above to reuse 
pipelines as transformers as well; here we must of course require both 
input and output to be xml. We could have something like:

<map:transform type="pipeline" src="cocoon:foo"/>

and

<map:transform type="pipeline" src="block:bar:foo"/>

or maybe a more general construct that can also use SOAP services or WebDAV:

<map:transform type="protocol" src="http://bar.org/foo"/>

-------------

What do you think?

Daniel Fagerstrom

