cocoon-dev mailing list archives

From Sylvain Wallez <sylv...@apache.org>
Subject Re: [C3] StAX research reveiled!
Date Tue, 13 Jan 2009 14:33:49 GMT
Jakob Spörk wrote:
> Hello,
>
> I just want to give my thoughts on the unified pipeline and data conversion
> topic. In my opinion, the pipeline can't do the data conversion, because it
> has no information about how to do this. Let's take a simple example: we
> have a pipeline processing XML documents that describe images. The first
> components process this XML data while the rest of the components do
> operations on the actual image. Now the question is: who will transform the
> XML data to image data in the middle of the pipeline?
>
> I believe the pipeline cannot do this, because it simply does not know how
> to transform; that's a custom operation. You would need a component that is
> on the one hand an XML consumer and on the other hand an image producer.
> Providing some automatic data conversions directly in the pipeline may help
> developers that need exactly these default cases, but I believe it would
> make things harder for people requiring custom data conversions (and those
> are most of the cases).
>   

Absolutely. The discussion was about having the pipeline automate the 
connection of components that deal with the same data, but with 
different representations of it. Think XML data represented as SAX, 
StAX, DOM or even text, and binary data represented as byte[], 
InputStream, OutputStream or NIO buffers.

Let's consider your example. We can have:
- an XML producer that outputs SAX events
- an XML transformer that pulls StAX events and writes SVG as StAX events 
to an XMLStreamWriter
- an SVG serializer that takes a DOM and renders it as a JPEG image on 
an output stream
- and finally an image transformer that adds a watermark to the image, 
reading an input stream and writing on an output stream.
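To make the last stage concrete, here is a minimal Java sketch of such a 
pull-input/push-output binary component. All names are illustrative, not 
actual Cocoon 3 API, and the "watermark" is just a trailing byte marker so 
the example stays self-contained; a real component would decode and 
re-encode the JPEG.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical component contract: pull bytes from an input stream,
// push the result to an output stream (exactly the Unix pipe pattern).
interface StreamTransformer {
    void transform(InputStream in, OutputStream out) throws IOException;
}

class WatermarkTransformer implements StreamTransformer {
    private final byte[] mark;

    WatermarkTransformer(String mark) {
        this.mark = mark.getBytes();
    }

    public void transform(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);   // copy the image bytes unchanged
        }
        out.write(mark);            // append the watermark marker
    }
}
```

Note that this component never sees SAX, StAX or DOM at all: it only 
declares the representations it consumes and produces.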

The pipeline must not have the responsibility of transforming data from 
one paradigm to another (i.e. an XML document to a JPEG image) because 
the way to do that highly depends on the application. But the pipeline 
should allow component developers to use whatever representation of 
that data best fits their needs, and allow the user not to care about 
the actual data representation as long as the components that are added 
to the pipeline are "compatible" (i.e. StAX, SAX and DOM are 
compatible). This can be achieved by adding the necessary transcoding 
bridges between components. And if such a bridge does not exist, then we 
can throw an exception because the pipeline is obviously incorrect.
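One way to picture that mechanism is a small registry of available bridges, 
consulted when the pipeline is assembled. This is only a sketch under 
assumed names (Repr, BridgeRegistry are hypothetical, not Cocoon API), but 
it shows the fail-fast behaviour when no bridge exists:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical data representations a component can consume or produce.
enum Repr { SAX, STAX, DOM, TEXT, BYTES }

// Hypothetical registry of transcoding bridges, keyed by (from -> to).
class BridgeRegistry {
    private final Set<String> bridges = new HashSet<>();

    void register(Repr from, Repr to) {
        bridges.add(from + "->" + to);
    }

    // Called while assembling the pipeline, for each pair of
    // adjacent components.
    void check(Repr producerOut, Repr consumerIn) {
        if (producerOut == consumerIn) {
            return;                 // direct connection, no bridge needed
        }
        if (!bridges.contains(producerOut + "->" + consumerIn)) {
            throw new IllegalStateException(
                "No bridge from " + producerOut + " to " + consumerIn);
        }
    }
}
```

The point is that the check happens once, at assembly time, so an 
incorrect pipeline fails before any data flows through it.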

Note that XML is a quite unique area where components can let data 
flow in a single direction through them (i.e. a SAX consumer producing 
SAX events). Most components that deal with binary data pull their input 
and push their output, which is exactly what Unix pipes do 
(read from stdin, write to stdout). So a universal pipeline API 
that also works with binary data requires addressing the push/pull 
conversion problem.
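As a sketch of the simplest (non-streaming) answer to that problem: let the 
pushing side run to completion into a buffer, then hand the pulling side an 
InputStream over that buffer. The names here are made up for illustration; 
a truly streaming pipeline would instead run the two sides concurrently, 
e.g. with PipedInputStream/PipedOutputStream on separate threads.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical adapter turning a push-style producer into a pull-style
// source, by buffering everything the producer writes.
class PushPullAdapter {

    interface Pusher {
        void push(OutputStream out) throws IOException;
    }

    static InputStream adapt(Pusher pusher) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        pusher.push(buffer);        // run the push side to completion
        return new ByteArrayInputStream(buffer.toByteArray());
    }
}
```

The obvious cost is that the whole intermediate result is held in memory, 
which is exactly why the push/pull conversion deserves a real answer in 
the pipeline API rather than an ad-hoc buffer in every component.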

Sylvain

-- 
Sylvain Wallez - http://bluxte.net

