commons-dev mailing list archives

From "Brett Henderson" <bre...@mail15.com>
Subject Streamable Codec Framework
Date Sun, 02 Nov 2003 23:46:48 GMT
Hi All,

I noticed Alexander Hvostov's recent email containing streamable
base64 codecs.  Given that the current codec implementations are
oriented around in-memory buffers, is there room for an
alternative codec framework supporting stream functionality?  I
realise the need for streamable codecs may not be that great but
it does seem like a gap in the current library.

I have done some work in this area over the last couple of months
as a small hobby project and have produced a small framework for
streamable codecs.

Some of the goals I was working towards were:
1. No memory allocation during streaming.  This eliminates
garbage collection during large conversions.
2. Pipelineable codecs.  Multiple codecs can be chained together
and treated as a single codec, so a codec such as base64 can be
broken into two components (a base64 codec and a line wrapping
codec).
3. Single OutputStream and InputStream implementations which
utilise codec engines internally.  This eliminates the need to
produce a buffer based engine and a stream engine for every codec.
Note that this requires codec engines to be written in a manner
that supports streaming.
4. Customisable receivers.  All codecs utilise receivers to
handle conversion results.  This allows different outputs such as
streams, in-memory buffers, etc. to be supported.
5. Direction agnostic codecs.  Decoupling the engine from the
streams allows the engines to be used in ways other than
originally intended.  For example, you can perform base64
encoding during reads from an InputStream.
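
To make the receiver and pipeline goals concrete, here is a rough
sketch of how such pieces might fit together.  All names here
(Receiver, HexEncoder, LineWrapper, StringReceiver) are made up for
illustration and are not the classes in the zip; the sketch only
assumes that an engine pushes its output to a receiver and that an
engine can itself act as a receiver for the stage before it.

```java
// Illustrative sketch only; every name below is hypothetical.

// A receiver accepts a slice of converted bytes. It must not keep a
// reference to the array, because engines reuse their buffers.
interface Receiver {
    void receive(byte[] buf, int off, int len);
}

// Hex encoding engine. It is a Receiver itself, so it can sit anywhere
// in a chain, and it reuses one two-byte buffer instead of allocating.
final class HexEncoder implements Receiver {
    private static final byte[] DIGITS = "0123456789abcdef"
            .getBytes(java.nio.charset.StandardCharsets.US_ASCII);
    private final Receiver next;
    private final byte[] out = new byte[2];

    HexEncoder(Receiver next) { this.next = next; }

    public void receive(byte[] buf, int off, int len) {
        for (int i = off; i < off + len; i++) {
            out[0] = DIGITS[(buf[i] >> 4) & 0x0F];
            out[1] = DIGITS[buf[i] & 0x0F];
            next.receive(out, 0, 2);
        }
    }
}

// Line wrapping as a separate stage: inserts '\n' after every `width`
// bytes, remembering the current column between calls.
final class LineWrapper implements Receiver {
    private static final byte[] NEWLINE = {'\n'};
    private final Receiver next;
    private final int width;
    private int column;

    LineWrapper(Receiver next, int width) {
        this.next = next;
        this.width = width;
    }

    public void receive(byte[] buf, int off, int len) {
        for (int i = off; i < off + len; i++) {
            if (column == width) {
                next.receive(NEWLINE, 0, 1);
                column = 0;
            }
            next.receive(buf, i, 1);
            column++;
        }
    }
}

// Terminal receiver that collects output in memory. A stream backed
// receiver would write to an OutputStream here instead.
final class StringReceiver implements Receiver {
    final StringBuilder text = new StringBuilder();

    public void receive(byte[] buf, int off, int len) {
        for (int i = off; i < off + len; i++) {
            text.append((char) (buf[i] & 0xFF));
        }
    }
}
```

Chaining is then just construction order, e.g.
new HexEncoder(new LineWrapper(sink, 76)), and the caller can feed
input in pieces of any size because each stage keeps its own state.
Passing one byte at a time between stages keeps the sketch short but
is slow; a real engine would post larger slices.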

I have produced base64 and ascii hex codecs as a proof of concept
and to evaluate performance.  They aren't as fast as the current
buffer based codecs and are unlikely to ever be, due to the
extra overheads associated with streaming.
Both base64 and ascii hex implementations can produce a data rate
of approximately 40MB/sec on a Pentium Mobile 1.5GHz notebook.
With some performance tuning I'm sure this could be improved; I
think array bounds checking is the largest performance hit.

The code currently requires jdk1.4 (the exception handling would
need rework for jdk1.3).
Running ant without arguments in the root directory will build
the project, run all unit tests and run performance tests.  Note
that the tests require junit to be available within ant.

Javadocs are the only documentation at the moment.

Files can be found at:
http://www32.brinkster.com/bretthenderson/BHCodec-0.2.zip

I hope someone finds this useful.  I'm not trying to force my
implementation on anybody and I'm sure it could be improved in
many ways.  I'm simply putting it forward as an optional approach.
If it is decided that streamable codecs are a useful addition to
commons I'd be glad to help.

Cheers,
Brett

PS.  Some areas that currently need improving are:
1. Exception handling requires jdk1.4.  It should be rewritten to
support older java versions.
2. BufferReceiver allocates memory continuously during streamed
conversions.  It should be fixed to recycle memory buffers.
3. Engines should have a new flush method added to allow them
to hold off posting to receivers until their internal buffers
fill up.  This would prevent fragmented buffers during
pipelined conversions.
4. OutputStream flush needs rework.  It shouldn't call finalize;
it should call the new flush method on CodecEngines instead.
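
The flush idea in item 3 could look roughly like the sketch below:
a stage that copies incoming bytes into a fixed internal buffer and
only posts to its receiver when the buffer is full or flush is
called.  ChunkReceiver, BufferingStage and CountingReceiver are
hypothetical names used for illustration, not existing classes in
the project.

```java
// Illustrative sketch only; every name below is hypothetical.

interface ChunkReceiver {
    void receive(byte[] buf, int off, int len);
}

// Holds output in a preallocated buffer and posts it downstream only
// when the buffer fills or flush() is called explicitly.
final class BufferingStage implements ChunkReceiver {
    private final ChunkReceiver next;
    private final byte[] buffer;
    private int used;

    BufferingStage(ChunkReceiver next, int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.next = next;
        this.buffer = new byte[capacity];
    }

    public void receive(byte[] buf, int off, int len) {
        while (len > 0) {
            // Copy as much as fits, then post if the buffer is full.
            int n = Math.min(len, buffer.length - used);
            System.arraycopy(buf, off, buffer, used, n);
            used += n;
            off += n;
            len -= n;
            if (used == buffer.length) {
                flush();
            }
        }
    }

    // Posts whatever is buffered, possibly a partial buffer.
    void flush() {
        if (used > 0) {
            int n = used;
            used = 0;
            next.receive(buffer, 0, n);
        }
    }
}

// Test receiver that records how many posts and bytes it has seen.
final class CountingReceiver implements ChunkReceiver {
    int calls;
    int bytes;

    public void receive(byte[] buf, int off, int len) {
        calls++;
        bytes += len;
    }
}
```

With a capacity of 4, ten single-byte writes reach the receiver as
two full posts, and a final flush delivers the remaining two bytes.
An OutputStream wrapper's flush could call this method rather than
finishing the conversion.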

