Mailing-List: contact commons-dev-help@jakarta.apache.org; run by ezmlm
Precedence: bulk
Reply-To: "Jakarta Commons Developers List" <commons-dev@jakarta.apache.org>
From: "Brett Henderson" <jakarta@bretth.com>
To: "'Jakarta Commons Developers List'" <commons-dev@jakarta.apache.org>
Subject: RE: [codec] StatefulDecoders
Date: Tue, 2 Mar 2004 09:55:49 +1100
Message-ID: <001401c3ffe0$575ce9d0$8263a88d@353661bh>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
In-Reply-To: <NBBBJGEAGJAKLIDBKJOPEEGJHJAB.noel@devtech.com>
Importance: Normal

Noel,

Sorry about the delay, I've been away for a few days.

> In general, I have long preferred the pipeline/event model to the
> approach that Alex had, where it would give data to the codec, and
> then poll it for state.  However, I don't see something in your
> implementation that I think we want.  We want to be able to have
> structured content handlers and customized events depending upon the
> content handler and the registered event handlers.  This could be
> particularly important in a streaming approach to MIME content.  And
> I also desperately want a regex in this same model.

You're right, my design has no concept of structured content.  It was
developed to solve a particular problem (i.e. efficient streamable data
manipulation).  If API support for structured content is required then
my implementation doesn't (yet) support it.

I'll use "engine", for want of a better word, to describe an element in
a pipeline performing some operation on the data passing through it.

An API aware of structured content shouldn't complicate the creation of
simple engines such as base64 which pay no attention to data structure.
Ideally, a structured API would extend an unstructured API and only
those engines requiring structured features would need to use it.

I'm having trouble visualising a design that supports structured content
without being specific to a particular type of structured content.  Do
you have some examples of what operations you would like a structured
data API to support?  Do you see interactions between pipeline elements
being strongly typed?

My design uses the concepts of producers and consumers, and I'd like to
see those ideas preserved.  Engines are both consumers and producers, but
the first and last elements in a chain (or pipeline) are only a producer
and a consumer respectively, allowing I/O to be decoupled from the
pipeline operations.  For example, my design uses an OutputStreamConsumer
to write pipeline result data to an OutputStream, an OutputStreamProducer
to receive data written to an OutputStream and pass it into a pipeline,
and an InputStreamProducer to pump data from an input stream and pass it
into a pipeline.
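
To make the producer/consumer idea concrete, here is a rough sketch of
how such a chain might look.  The class names OutputStreamConsumer,
InputStreamProducer and ByteByteEngine are the ones mentioned in this
mail, but every method signature, the setConsumer/finish/pump methods,
UpperCaseEngine and PipelineDemo are assumptions made up for
illustration and are not taken from the actual implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical interfaces: signatures are illustrative assumptions only.
interface ByteConsumer {
    void consume(byte[] data, int offset, int length) throws IOException;
    void finish() throws IOException; // signals end of data
}

interface ByteProducer {
    void setConsumer(ByteConsumer consumer);
}

// An engine sits mid-pipeline: it consumes bytes and produces bytes.
interface ByteByteEngine extends ByteConsumer, ByteProducer {
}

// Toy engine (not from the original design): upper-cases ASCII letters.
class UpperCaseEngine implements ByteByteEngine {
    private ByteConsumer next;

    public void setConsumer(ByteConsumer consumer) { this.next = consumer; }

    public void consume(byte[] data, int offset, int length) throws IOException {
        byte[] out = new byte[length];
        for (int i = 0; i < length; i++) {
            byte b = data[offset + i];
            out[i] = (b >= 'a' && b <= 'z') ? (byte) (b - 32) : b;
        }
        next.consume(out, 0, length);
    }

    public void finish() throws IOException { next.finish(); }
}

// Last element in the chain: writes pipeline results to an OutputStream.
class OutputStreamConsumer implements ByteConsumer {
    private final OutputStream out;

    OutputStreamConsumer(OutputStream out) { this.out = out; }

    public void consume(byte[] data, int offset, int length) throws IOException {
        out.write(data, offset, length);
    }

    public void finish() throws IOException { out.flush(); }
}

// First element in the chain: pumps an InputStream into the pipeline.
class InputStreamProducer implements ByteProducer {
    private final InputStream in;
    private ByteConsumer consumer;

    InputStreamProducer(InputStream in) { this.in = in; }

    public void setConsumer(ByteConsumer consumer) { this.consumer = consumer; }

    void pump() throws IOException {
        byte[] buffer = new byte[4]; // deliberately small to show chunked streaming
        int count;
        while ((count = in.read(buffer)) != -1) {
            consumer.consume(buffer, 0, count);
        }
        consumer.finish();
    }
}

// Wires source -> engine -> sink and returns what reached the sink.
class PipelineDemo {
    static String run(String text) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            InputStreamProducer source = new InputStreamProducer(
                    new ByteArrayInputStream(text.getBytes(StandardCharsets.US_ASCII)));
            UpperCaseEngine engine = new UpperCaseEngine();
            source.setConsumer(engine);
            engine.setConsumer(new OutputStreamConsumer(sink));
            source.pump();
            return new String(sink.toByteArray(), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that the engine knows nothing about streams; only the first and
last elements touch I/O, which is the decoupling described above.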

A structured content API can extend the producer/consumer ideas by
passing data types understood by the structured content in question.

For example, a multipart MIME decoding engine (a consumer of byte data,
hence a ByteConsumer) could produce MIME parts (making it a
MIMEPartProducer).  A MIMEPartConsumer would receive MIMEPart objects
(which are in turn ByteByteEngines extended with a MIME type property)
and connect them to a consumer capable of handling the byte data
contained in the MIME part.

The above example would involve the definition of several new interfaces
(MIMEPart extending ByteByteEngine and adding a MIME type property,
MIMEPartProducer extending Producer, MIMEPartConsumer extending Consumer)
and new classes implementing those interfaces with the desired
behaviour.
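
As a sketch only, the new interfaces might look something like the
following.  The method names (getMimeType, setConsumer, consume) and the
classes SimpleMIMEPart, TextCollectingConsumer and MimeSketchDemo are
hypothetical, the common Producer/Consumer base interfaces are left out
for brevity, and no real MIME parsing is attempted: the demo hands over
ready-made parts where a real decoder would create them from the byte
stream.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Unstructured base API (hypothetical signatures).
interface ByteConsumer { void consume(byte[] data, int offset, int length); }
interface ByteProducer { void setConsumer(ByteConsumer consumer); }
interface ByteByteEngine extends ByteConsumer, ByteProducer { }

// Structured extension: only MIME-aware engines need these types.
interface MIMEPart extends ByteByteEngine { String getMimeType(); }
interface MIMEPartConsumer { void consume(MIMEPart part); }
// A multipart decoder would implement ByteConsumer and MIMEPartProducer.
interface MIMEPartProducer { void setConsumer(MIMEPartConsumer consumer); }

// A pass-through part: forwards its bytes to whatever consumer is attached.
class SimpleMIMEPart implements MIMEPart {
    private final String mimeType;
    private ByteConsumer next;

    SimpleMIMEPart(String mimeType) { this.mimeType = mimeType; }

    public String getMimeType() { return mimeType; }

    public void setConsumer(ByteConsumer consumer) { this.next = consumer; }

    public void consume(byte[] data, int offset, int length) {
        next.consume(data, offset, length);
    }
}

// Receives parts and connects each one to a suitable byte consumer:
// text parts are collected, everything else is discarded.
class TextCollectingConsumer implements MIMEPartConsumer {
    final List<String> seenTypes = new ArrayList<>();
    private final ByteArrayOutputStream text = new ByteArrayOutputStream();

    public void consume(MIMEPart part) {
        seenTypes.add(part.getMimeType());
        if (part.getMimeType().startsWith("text/")) {
            part.setConsumer((data, offset, length) -> text.write(data, offset, length));
        } else {
            part.setConsumer((data, offset, length) -> { });
        }
    }

    String collectedText() {
        return new String(text.toByteArray(), StandardCharsets.US_ASCII);
    }
}

// Stands in for a decoder: announces two parts, then streams their bytes.
class MimeSketchDemo {
    static String run() {
        TextCollectingConsumer handler = new TextCollectingConsumer();
        MIMEPart textPart = new SimpleMIMEPart("text/plain");
        MIMEPart imagePart = new SimpleMIMEPart("image/png");
        handler.consume(textPart);
        handler.consume(imagePart);

        byte[] hello = "hello".getBytes(StandardCharsets.US_ASCII);
        textPart.consume(hello, 0, hello.length);
        imagePart.consume(new byte[] {1, 2, 3}, 0, 3);

        return handler.seenTypes + " " + handler.collectedText();
    }
}
```

Simple engines such as base64 would only ever see ByteConsumer and
ByteProducer; the MIME types are layered on top without changing them.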

Any other structured content types could be handled in similar ways with
new "event" types being defined and relevant producer and consumer
interfaces created to support them.

Perhaps a more generic method can be devised, but it would be hard to
avoid weak typing and degraded performance.


> Drop the word "conversion".

Yep, agreed.

> Conversion is simply one of many possible operations.  These are
> pipelines; receiving content on one end, performing operations, and
> generating events down a chain.  More than one event could be
> generated at any point, and the chain can have multiple paths.

If the above can be achieved without introducing a large overhead (both
runtime and coding overhead) for simple operations then it sounds good.
Is it worth considering the possibility of a pipeline receiving data from
more than one source?  This may be necessary when composing multipart
MIME messages.  Then again, a multipart MIME consumer class may be a
better solution, using similar ideas to those described earlier (i.e. a
MIMEPartConsumer which combines all parts into a single byte stream).
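
A very rough sketch of that last idea follows.  All names and signatures
here (CombiningMIMEPartConsumer, SimplePart, finish, the boundary
parameter) are hypothetical, MIMEPart is cut down to the bare minimum
rather than extending ByteByteEngine, and the output only mimics the
general shape of a multipart body; it is not a conforming MIME writer
and assumes parts arrive and are streamed one after another.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Minimal hypothetical types, simplified for this sketch.
interface ByteConsumer { void consume(byte[] data, int offset, int length); }

interface MIMEPart {
    String getMimeType();
    void setConsumer(ByteConsumer consumer);
    void consume(byte[] data, int offset, int length);
}

interface MIMEPartConsumer {
    void consume(MIMEPart part);
    void finish(); // no more parts will arrive
}

// Pass-through part used only to drive the demo.
class SimplePart implements MIMEPart {
    private final String mimeType;
    private ByteConsumer next;

    SimplePart(String mimeType) { this.mimeType = mimeType; }

    public String getMimeType() { return mimeType; }

    public void setConsumer(ByteConsumer consumer) { this.next = consumer; }

    public void consume(byte[] data, int offset, int length) {
        next.consume(data, offset, length);
    }
}

// Funnels every part into one byte stream, separated by boundary lines.
class CombiningMIMEPartConsumer implements MIMEPartConsumer {
    private final ByteConsumer sink;
    private final String boundary;
    private boolean first = true;

    CombiningMIMEPartConsumer(ByteConsumer sink, String boundary) {
        this.sink = sink;
        this.boundary = boundary;
    }

    public void consume(MIMEPart part) {
        // Write the delimiter and header, then let the part stream its
        // body straight into the shared sink.
        String header = (first ? "" : "\r\n") + "--" + boundary + "\r\n"
                + "Content-Type: " + part.getMimeType() + "\r\n\r\n";
        first = false;
        emit(header);
        part.setConsumer(sink);
    }

    public void finish() {
        emit("\r\n--" + boundary + "--\r\n");
    }

    private void emit(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        sink.consume(bytes, 0, bytes.length);
    }
}

class CombineDemo {
    static String run() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        CombiningMIMEPartConsumer combiner = new CombiningMIMEPartConsumer(
                (data, offset, length) -> out.write(data, offset, length), "XYZ");

        String[][] parts = { {"text/plain", "hi"}, {"text/html", "<b>"} };
        for (String[] p : parts) {
            SimplePart part = new SimplePart(p[0]);
            combiner.consume(part);
            byte[] body = p[1].getBytes(StandardCharsets.US_ASCII);
            part.consume(body, 0, body.length);
        }
        combiner.finish();
        return new String(out.toByteArray(), StandardCharsets.US_ASCII);
    }
}
```

With this approach each part remains a separate source, but the
pipeline downstream of the combiner still sees a single byte stream.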

I'm not sure how much sense I've made above, hopefully some ;-)

Brett

