harmony-dev mailing list archives

From Stefano Mazzocchi <stef...@apache.org>
Subject Re: [contribution] Contribution of charset encoders/decoders for NIO_CHAR module
Date Mon, 09 Apr 2007 16:19:38 GMT
Vladimir Strigun wrote:
> Hi all!
> I'm happy to announce one more contribution to Harmony on behalf of
> Intel. The provided implementation of charset encoders/decoders is
> intended to replace the ICU-based charset encoding/decoding
> operations. The code was developed in a clean-room environment inside
> Intel, and I'd like you to play with it and include it in the current
> Harmony tree.
> The package can be found here:
> HARMONY-3593
> The algorithms for charset encoding/decoding differ from those of
> ICU. All charsets are generated from the current Harmony or any other
> implementation of Java and can be properly integrated into the current
> nio_char module. The archive contains source files for 6 charsets:
> GB18030, US-ASCII, ISO-8859-1, UTF-8, UTF-16, UTF-16BE, UTF-16LE;
> an implementation of CharsetProvider; a generator for other charsets;
> and the native part. I've tested the package with more than 90 charsets,
> and all benchmarks and tests passed with the new bundle. Additionally, I
> see a significant boost for the Dacapo.antlr and Dacapo.xalan benchmarks
> with the current Harmony tree on DRLVM and IBM VM. On DRLVM I see a 2.5x
> boost for antlr and ~5-8x for xalan.
> The main advantages of the package are the following:
>  - Code for every charset is generated by CharsetGenerator, so if
> some modification is necessary we just need to correct the generator
> and regenerate all sources.
>  - We use 2 different encoders and decoders for Java heap buffers and
> direct buffers. Since most applications use Java heap buffers, the new
> implementation, unlike the existing one, doesn't produce lots of native
> calls to perform encoding/decoding operations on Java buffers, thus
> significantly improving performance. This is the main reason why we
> have such a significant boost for Dacapo.

Wow, this is huge! Is there any significant change in the speed of
string creation too?

>  - Charset tables for encoding/decoding are stored in the appropriate classes.
> Since the package contains implementations for 6 charsets only,
> documentation on how to generate and build additional charsets can be
> found in the README file in the contributed package.
> Please do not hesitate to contact me for more details.

Thanks for the awesome contribution.
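
For anyone following the thread, the heap-versus-direct buffer split described
in the quoted message might look roughly like the sketch below. It is not taken
from the HARMONY-3593 archive: the class name Latin1DecodeSketch and its methods
are hypothetical, and ISO-8859-1 is used only because each byte maps to one
char. The idea is that a heap buffer exposes its backing array, so decoding can
run as a plain Java loop, while other buffers go through the buffer API.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;

// Hypothetical illustration only; not the contributed Harmony code.
public class Latin1DecodeSketch {

    // Heap path: read straight from the backing byte[] in a plain Java loop.
    static void decodeHeap(ByteBuffer in, CharBuffer out) {
        byte[] src = in.array();
        int offset = in.arrayOffset();
        int pos = offset + in.position();
        int limit = offset + in.limit();
        while (pos < limit && out.hasRemaining()) {
            // ISO-8859-1: each byte value is the Unicode code point.
            out.put((char) (src[pos++] & 0xFF));
        }
        // Record how many bytes were consumed.
        in.position(pos - offset);
    }

    // Fallback path for buffers without an accessible array (e.g. direct ones).
    static void decodeDirect(ByteBuffer in, CharBuffer out) {
        while (in.hasRemaining() && out.hasRemaining()) {
            out.put((char) (in.get() & 0xFF));
        }
    }

    // Picks a path based on whether the buffer exposes a backing array.
    static String decode(ByteBuffer in) {
        CharBuffer out = CharBuffer.allocate(in.remaining());
        if (in.hasArray()) {
            decodeHeap(in, out);
        } else {
            decodeDirect(in, out);
        }
        out.flip();
        return out.toString();
    }
}
```

Both paths produce the same result here; the difference is only in how the
bytes are reached, which is where the quoted message locates the saving.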

