cocoon-dev mailing list archives

From Grzegorz Kossakowski <>
Subject Re: [C3] StAX research reveiled!
Date Mon, 12 Jan 2009 20:53:53 GMT
Jakob Spörk writes:
> Hello,

Hello Jakob,

> I just want to give my thoughts to unified pipeline and data conversion
> topic. In my opinion, the pipeline can't do the data conversion, because it
> has no information about how to do this. Let's take a simple example: We
> have a pipeline processing XML documents that describe images. The first
> components process this xml data while the rest of the components do
> operations on the actual image. Now the question is: who will transform
> the XML data to image data in the middle of the pipeline?

I agree with you that the pipeline implementation should not handle data
conversion, because there is no generic way to handle it.

Now I would like to answer your question: it should be another /pipeline
component/ that handles data conversion.
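To make the idea concrete, here is a hedged sketch of such a converter component for Jakob's XML-to-image example. All the interface names below (XmlConsumer, ImageConsumer, ImageProducer, XmlToImageConverter) are illustrative assumptions, not Cocoon's actual API:

```java
// A sketch of the converter-as-component idea. The interfaces here are
// illustrative, not Cocoon's real pipeline API.
import java.awt.image.BufferedImage;

interface XmlConsumer {
    void consume(String xmlDocument);
}

interface ImageConsumer {
    void consume(BufferedImage image);
}

interface ImageProducer {
    void setConsumer(ImageConsumer next);
}

// Bridges the two halves of the pipeline: an XML consumer on one side,
// an image producer on the other.
class XmlToImageConverter implements XmlConsumer, ImageProducer {
    private ImageConsumer next;

    public void setConsumer(ImageConsumer next) {
        this.next = next;
    }

    public void consume(String xmlDocument) {
        // Real rendering logic elided; a blank canvas stands in for the
        // image described by the XML document.
        BufferedImage rendered =
            new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        next.consume(rendered);
    }
}
```

The converter sits in the middle of the pipeline like any other component; the XML-processing components feed it and the image-processing components consume from it.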

> I believe the pipeline cannot do this, because it simply does not know how
> to transform; that's a custom operation. You would need a component that is
> an XML consumer on the one hand and an image producer on the other.
> Providing some automatic data conversions directly in the pipeline may help
> developers who need exactly these default cases, but I believe it would
> make things harder for people requiring custom data conversions (and those
> are most of the cases).
> The current architecture allows fitting any components into the pipeline,
> and only the components themselves have to know whether they can work with
> their predecessor or the component following them. That allows the most
> flexibility when thinking about any possible conversions. If a pipeline
> should do this, you would need "plug-ins" for the pipeline that are
> registered and allow the pipeline to do the conversions. But then it is the
> responsibility of the developer to register the right conversion plug-ins,
> and you would get new problems if a pipeline requires two different
> converters from the same type to the same data type, because the pipeline
> cannot automatically know which converter to use in which situation.

I believe that these problems could be addressed by... the compiler. In my
opinion, pipelines should be type-safe, which basically means that for a
given pipeline fragment you know what it expects on the input and what kind
of output it gives you. The same goes for components. This eliminates the
"flexibility" of having a component that accepts more than one kind of input
or produces more than one kind of output. I believe that having more than
one input or output type only adds complexity and does not solve any
problem.
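A minimal sketch of what I mean, assuming a hypothetical Component abstraction (none of these names come from Cocoon's real API): each component declares exactly one input type and one output type, and chaining only compiles when the types line up.

```java
// Hypothetical typed component: one input type I, one output type O.
interface Component<I, O> {
    O process(I input);

    // then() only accepts a component whose input type matches our
    // output type, so a mismatched pipeline is rejected at compile time.
    default <R> Component<I, R> then(Component<O, R> next) {
        return input -> next.process(this.process(input));
    }
}
```

With this, `length.then(show)` below compiles because the Integer output matches the Integer input, while chaining two String-to-Integer components would be a compile error rather than a runtime surprise:

```java
Component<String, Integer> length = String::length;
Component<Integer, String> show = n -> "len=" + n;
Component<String, String> pipeline = length.then(show);
// pipeline.process("abc") yields "len=3"
```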

If a component were to accept more than one kind of input, how could a user
know the list of accepted inputs? I guess the only way to find out would be
to check the source and look for all the "instanceof" statements in its
code.

I would prefer a situation where components have well-defined input and
output types, and if you want to combine components whose input-output pairs
do not match, you add converters as intermediate components.
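Here is a hedged sketch of that converter-insertion step, again with illustrative names (Stage, join, and the stage fields are my assumptions, not real Cocoon code):

```java
// One input type, one output type per stage.
interface Stage<I, O> {
    O process(I input);
}

class MismatchedPipeline {
    // Generic composition: only compiles when B matches on both sides.
    static <A, B, C> Stage<A, C> join(Stage<A, B> first, Stage<B, C> second) {
        return a -> second.process(first.process(a));
    }

    static final Stage<String, String> trimXml = s -> s.trim();
    static final Stage<Integer, Integer> doubleWidth = w -> w * 2;

    // join(trimXml, doubleWidth) would NOT compile: String != Integer.
    // An intermediate converter restores the match (a stand-in for real
    // parsing logic):
    static final Stage<String, Integer> widthOf = xml -> xml.length();

    static final Stage<String, Integer> pipeline =
        join(join(trimXml, widthOf), doubleWidth);
}
```

The compiler, not the developer's discipline, decides which converter is needed where: leaving `widthOf` out is a type error, which addresses Jakob's worry about registering the right conversion plug-ins.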

I've been thinking about generic but at the same time type-safe pipelines
for some time. I designed them on paper and everything looked quite
promising. Then I moved to implementing my ideas and got a rather
disappointing result, which can be seen here:

The most interesting files are:
(generic and type-safe pipeline interface)
(generic and type-safe component definition)
(shows how to use that thing)

> The only thing Cocoon can help with here is to provide as many "standard"
> converters as possible, but it is still the responsibility of the
> developer to use the right ones.

I think Cocoon could define a much better, type-safe Pipeline API, but we
are in the unfortunate situation of using a language that makes it extremely
hard to express this kind of generic solution.

Of course, I would like to be proven wrong and shown that Java is powerful
enough to let us express our ideas and solve our problems. Actually, the
whole idea of a pipeline is not rocket science, as it is, in essence, just
ordinary function composition. The only unique property of pipelines I can
see is that we want access to _partial_ results of the pipeline execution so
we can make it streamable.
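A sketch of that last point: written in push style, a composed pipeline forwards each partial result downstream as soon as it exists, which is what makes ordinary function composition streamable. The stage names here are illustrative, not anything from Cocoon:

```java
import java.util.function.Consumer;

class PushStages {
    // Split each incoming line and push every word downstream
    // immediately, instead of collecting the whole result first.
    static Consumer<String> splitWords(Consumer<String> downstream) {
        return line -> {
            for (String word : line.split("\\s+")) {
                downstream.accept(word);
            }
        };
    }

    // Uppercase each token as it arrives.
    static Consumer<String> upperCase(Consumer<String> downstream) {
        return token -> downstream.accept(token.toUpperCase());
    }
}
```

Composing them, e.g. `splitWords(upperCase(sink))`, is plain function composition, yet the sink observes partial results before the whole input has been consumed.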

This has become more of a brain-dump than a real answer to your e-mail,
Jakob, but I hope you (and others) got my point.

Best regards,
Grzegorz Kossakowski
