commons-dev mailing list archives

From "Brett Henderson" <bre...@mail15.com>
Subject RE: [codec] Streamable Codec Framework
Date Mon, 10 Nov 2003 03:26:15 GMT
I think the design of the codec framework could cover
your requirements but it will require more functionality
than it currently has.

> > > > Some of the goals I was working towards were:
> > > > 1. No memory allocation during streaming.  This eliminates
> > > > garbage collection during large conversions.
> > > Cool. I got large conversions... I'm already at
> > > mediumblob in MySQL, and it goes up/down an XML stream :)
> > 
> > I have a lot to learn here.  While I have some knowledge
> > of XML (like every other developer on the planet), I
> > have never used it for large data sets or used SAX parsing.
> > Sounds like a good test to find holes in the design :-)
> 
> It's easy. You get a callback where you can gobble up
> string buffers with the incoming chars for the element
> contents (and there is a lot of this stuff...).
> After the tag is closed, you have all the chars in a big
> string buffer, and get another callback. In this
> callback you have to convert the data and do whatever is
> necessary (in my case, create an input stream and pass
> it to the database).
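
The callback flow described above can be sketched with the standard SAX API.
This is only an illustration under assumptions: the element name "blob" and
the class names are made up, and java.util.Base64 (Java 8+) stands in for
the codec framework under discussion. Note that it buffers the whole element
in memory, which is exactly the cost the streaming design is meant to avoid.

```java
import java.io.StringReader;
import java.util.Base64;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxBase64Sketch {
    /** Collects the character data of a (hypothetical) <blob> element. */
    static final class BlobHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        byte[] decoded;

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            text.setLength(0); // start collecting afresh for each element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called many times per element; just accumulate.
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if ("blob".equals(qName)) {
                // The MIME decoder ignores line breaks inside the base64 text.
                decoded = Base64.getMimeDecoder().decode(text.toString());
            }
        }
    }

    /** Parses the XML and returns the decoded content of the last <blob>. */
    static byte[] parse(String xml) {
        try {
            BlobHandler handler = new BlobHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.decoded;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```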

This could be tricky.  It's something I've been thinking
about, and I would like feedback from others on the best
way of going about it.

The data you have available is in character format.
The base64 codec engine operates on byte buffers.
The writer you want to write to requires the data
to be in character format.

I have concentrated on byte processing for now because
it is the most common requirement.  XML processing
requires that characters be used instead.

It makes no sense to perform base64 conversion on
character arrays directly because base64 operates on
8-bit bytes (you could split each 16-bit character into
two bytes, but this would double the input size, and so
the result buffer, where chars only contain ASCII data).

I think it makes more sense to perform character to
byte conversion separately (perhaps through
extensions to the existing framework) and then perform
base64 encoding on the result.  I guess this is a
UTF-16 to UTF-8 conversion ...
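
A minimal sketch of the two-step idea, assuming UTF-8 as the byte encoding.
java.util.Base64 (Java 8+) is used here only as a stand-in for the
framework's base64 engine, and the class and method names are illustrative.
It also shows the size penalty of treating each char as two bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class CharsThenBase64 {
    /** Step 1: chars to bytes in the given charset. Step 2: base64 the bytes. */
    static String encodeVia(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        String text = "Hello";
        // 5 ASCII chars become 5 UTF-8 bytes, giving 8 base64 chars.
        System.out.println(encodeVia(text, StandardCharsets.UTF_8)); // SGVsbG8=
        // The same chars as raw 16-bit units are 10 bytes, giving 16 base64 chars.
        System.out.println(encodeVia(text, StandardCharsets.UTF_16BE).length());
    }
}
```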

What support is there within the JDK for performing
character to byte conversion?
JDK1.4 has the java.nio.charset package but I can't
see an equivalent for JDK1.3 and lower; they seem to
use com.sun classes internally when charset conversion
is required.
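
One route that does not depend on java.nio.charset is
java.io.OutputStreamWriter, whose constructor has accepted a charset name
since JDK 1.1. The sketch below assumes UTF-8 and uses illustrative class
and method names; it is not part of the codec framework.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;

public class WriterConversionSketch {
    /** Converts chars to UTF-8 bytes by pushing them through a Writer. */
    static byte[] toUtf8(char[] chars) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            // The Writer does the char-to-byte conversion as data flows through.
            Writer writer = new OutputStreamWriter(bytes, "UTF-8");
            writer.write(chars, 0, chars.length);
            writer.close(); // flushes any bytes still held by the converter
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }
}
```

In a streaming setting the ByteArrayOutputStream would be replaced by
whatever OutputStream feeds the next stage.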

If JDK1.4 is considered a sufficient base, I could
extend the current framework to provide conversion
engines that translate from one data representation
to another.  I could then create a new CodecEngine
interface to handle character buffers (e.g.
CodecEngineChar).
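
A rough sketch of what such a conversion engine might look like on top of
java.nio.charset.CharsetEncoder. Everything here is hypothetical: the
CodecEngineChar and ByteReceiver signatures are guesses made for
illustration, not the framework's actual interfaces. Buffers are allocated
once and reused, in keeping with the no-allocation goal.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

public class CharEngineSketch {
    /** Hypothetical receiver of converted bytes. */
    interface ByteReceiver {
        void receive(byte[] buf, int off, int len);
    }

    /** Hypothetical character-buffer counterpart of a byte codec engine. */
    interface CodecEngineChar {
        void convert(char[] buf, int off, int len);
        void finish();
    }

    static final class CharToByteEngine implements CodecEngineChar {
        private final CharsetEncoder encoder;
        private final CharBuffer in = CharBuffer.allocate(1024);  // reused
        private final ByteBuffer out = ByteBuffer.allocate(4096); // reused
        private final ByteReceiver receiver;

        CharToByteEngine(Charset charset, ByteReceiver receiver) {
            this.encoder = charset.newEncoder();
            this.receiver = receiver;
        }

        public void convert(char[] buf, int off, int len) {
            while (len > 0) {
                int n = Math.min(len, in.remaining());
                in.put(buf, off, n);
                off += n;
                len -= n;
                in.flip();
                encode(false);
                in.compact(); // keeps e.g. a dangling high surrogate for the next call
            }
        }

        public void finish() {
            in.flip();
            encode(true);
            CoderResult r;
            do {
                r = encoder.flush(out);
                drain();
            } while (r.isOverflow());
            in.clear();
            encoder.reset(); // ready for reuse
        }

        private void encode(boolean endOfInput) {
            while (true) {
                CoderResult r = encoder.encode(in, out, endOfInput);
                drain();
                if (r.isError()) throw new IllegalArgumentException("cannot encode: " + r);
                if (r.isUnderflow()) return; // input consumed; overflow loops again
            }
        }

        private void drain() {
            out.flip();
            if (out.hasRemaining()) receiver.receive(out.array(), 0, out.limit());
            out.clear();
        }
    }
}
```

The receiver could in turn be a base64 engine, giving the
char -> byte -> base64 chain described above.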


> > > > 3. Customisable receivers.  All codecs utilise receivers to
> > > > handle conversion results.  This allows different outputs such as
> > > > streams, in-memory buffers, etc. to be supported.
> > > 
> > > And writers :) Velocity directives use them.
> > 
> > Do you mean java.io.Writer?  If so I haven't included
> > direct support for them because I focused on raw byte
> > streams.  However it shouldn't be hard to add a
> > receiver to write to java.io.Writer instances.
> 
> 
> My scenarios: 
> - I'm exporting information as base64 to XML with the help
> of Velocity. I do it through a custom directive; in this
> directive I get a Writer from Velocity, where
> I have to put my data. 
> 
> Ideally the codec would: read the input stream, encode, and
> put the result into the writer without allocating too much 
> memory. 
> 
> - I'm importing information: I have a stream (string) of
> base64 data, and the codec gives me an input stream which is
> fed from this source, does not allocate too much memory and
> behaves politely...
> 
The current framework doesn't handle direct conversion
from an input stream to an output stream but this
would be simple to add if required.
Again, the hard part would be the char/byte issues.
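
For the export case, the char/byte issue is milder than it looks, because
base64 output is pure ASCII and each output byte maps to exactly one char.
A minimal sketch, assuming java.util.Base64 (Java 8+) in place of the
framework's engine; the adapter and method names are illustrative only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Base64;

public class StreamToWriterSketch {
    /** Adapter: forwards ASCII bytes to a Writer, one char per byte. */
    static final class AsciiWriterOutputStream extends OutputStream {
        private final Writer writer;

        AsciiWriterOutputStream(Writer writer) {
            this.writer = writer;
        }

        @Override
        public void write(int b) throws IOException {
            writer.write(b & 0xFF);
        }

        @Override
        public void flush() throws IOException {
            writer.flush();
        }
        // close() is the inherited no-op, so the caller's Writer stays open.
    }

    /** Reads the stream in fixed-size chunks and writes base64 to the Writer. */
    static void encode(InputStream in, Writer out) throws IOException {
        byte[] buf = new byte[3 * 1024]; // single reusable read buffer
        try (OutputStream b64 = Base64.getEncoder().wrap(new AsciiWriterOutputStream(out))) {
            int n;
            while ((n = in.read(buf)) != -1) {
                b64.write(buf, 0, n);
            }
        } // closing the wrapper emits the final padding
        out.flush();
    }

    /** Convenience wrapper used to try the sketch out. */
    static String encodeToString(byte[] data) {
        StringWriter writer = new StringWriter();
        try {
            encode(new ByteArrayInputStream(data), writer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return writer.toString();
    }
}
```

The import direction could be handled similarly by wrapping a byte view of
the base64 text with Base64.getDecoder().wrap(InputStream), which yields an
InputStream of decoded bytes.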

