harmony-dev mailing list archives

From "Vladimir Strigun" <vstri...@gmail.com>
Subject Re: [contribution] Contribution of charset encoders/decoders for NIO_CHAR module
Date Tue, 10 Apr 2007 05:23:38 GMT
On 4/10/07, Yang Paulex <paulex.yang@gmail.com> wrote:
> 2007/4/9, Vladimir Strigun <vstrigun@gmail.com>:
> >
> > On 4/9/07, Yang Paulex <paulex.yang@gmail.com> wrote:
> > > 2007/4/9, Vladimir Strigun <vstrigun@gmail.com>:
> > > >
> > > > Hi all!
> > > >
> > > > I'm happy to announce one more contribution to Harmony on behalf of
> > > > Intel. The provided implementation of charset encoders/decoders is
> > > > intended to replace the ICU-based charset encoding/decoding
> > > > operations. The code was developed in a clean-room environment inside
> > > > Intel, and I'd like you to play with it and include it in the current
> > > > Harmony tree.
> > > >
> > > > The package can be found here:
> > > > HARMONY-3593
> > > >
> > > > The algorithms for charset encoding/decoding differ from those of
> > > > ICU. All charsets are generated from the current Harmony or any other
> > > > implementation of Java and can be properly integrated into the current
> > > > nio_char module. The archive contains source files for 6 charsets:
> > > > GB18030, US-ASCII, ISO-8859-1, UTF-8, UTF-16, UTF-16BE, UTF-16LE;
> > > > an implementation of CharsetProvider; a generator for other charsets;
> > > > and the native part. I've tested the package with more than 90
> > > > charsets, and all benchmarks and tests passed with the new bundle.
> > > > Additionally, I see a significant boost for the Dacapo.antlr and
> > > > Dacapo.xalan benchmarks with the current Harmony tree on DRLVM and
> > > > IBM VM. On DRLVM I see a 2.5x boost for antlr and ~5-8x for xalan.
> > > >
> > > > The main advantages of the package are the following:
> > > >   - Code for every charset is generated by CharsetGenerator; thus, if
> > > > some modification is necessary, we just need to correct the generator
> > > > and re-generate all sources.
> > > >   - We use 2 different encoders and decoders, for java and direct
> > > > buffers. Since most applications use java heap buffers, unlike the
> > > > existing implementation it doesn't produce lots of native calls to
> > > > perform encoding/decoding operations on the java buffers, thus
> > > > significantly improving performance. This is the main reason why we
> > > > have such a significant boost for Dacapo.
> > > >   - Charset tables for encoding/decoding are stored in the appropriate
> > > > classes.
> > > >
> > > > Since the package contains implementations for 6 charsets only,
> > > > documentation on how to generate and build additional charsets can be
> > > > found in the README file of the contributed package.
> > > >
> > > > Please do not hesitate to contact me for more details.
> > > >
> > > > Thanks,
> > > > Vladimir.
> > > >
> > >
> > > Good work, Vladimir and team in Intel!
> > >
> > > I'm also interested in a pure Java charset conversion provider for
> > > Harmony, because the frequent JNI invocation in ICU4JNI (the current
> > > Harmony charset provider) may impair performance when dealing with
> > > small chunks of bytes. But I noticed that, in this contribution,
> > > US_ASCII, ISO_8859_1 and GB18030 are implemented in native C. Just out
> > > of interest, is there any special reason not to implement them in Java?
> >
> > As I wrote earlier, 2 branches of code are generated for every
> > encoder/decoder: a java one and a native one. The native branch is used
> > only for processing native byte buffers. It could easily be removed
> > by a small modification of the generators, but performance measurements
> > show that it's better to use native decoders/encoders
> > in the case of native buffers.
>
>
> So there may be two implementations (one native, one java) for one
> charset?

exactly
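
To illustrate the two-branch idea, here is a minimal sketch of a decoder
that picks a path based on the buffer type. It is not code from the
contributed package: the class name Latin1SketchDecoder and its helper
methods are hypothetical, ISO-8859-1 is chosen only because its mapping is
trivial, and the second branch is a plain Java loop standing in for the
native code that the contribution uses for direct buffers.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

// Hypothetical class, for illustration only.
final class Latin1SketchDecoder extends CharsetDecoder {

    Latin1SketchDecoder(Charset cs) {
        super(cs, 1.0f, 1.0f); // one char per byte for ISO-8859-1
    }

    @Override
    protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
        // Heap buffers expose their backing arrays, so we can decode
        // without any per-element buffer or native calls.
        if (in.hasArray() && out.hasArray()) {
            return decodeHeap(in, out);
        }
        // Direct (or read-only) buffers take the other branch. In the
        // contribution this branch is native; here it is plain Java.
        return decodeOther(in, out);
    }

    private CoderResult decodeHeap(ByteBuffer in, CharBuffer out) {
        byte[] src = in.array();
        char[] dst = out.array();
        int sp = in.arrayOffset() + in.position();
        int sl = in.arrayOffset() + in.limit();
        int dp = out.arrayOffset() + out.position();
        int dl = out.arrayOffset() + out.limit();
        while (sp < sl && dp < dl) {
            dst[dp++] = (char) (src[sp++] & 0xFF);
        }
        // Publish progress back to the buffers.
        in.position(sp - in.arrayOffset());
        out.position(dp - out.arrayOffset());
        return sp < sl ? CoderResult.OVERFLOW : CoderResult.UNDERFLOW;
    }

    private CoderResult decodeOther(ByteBuffer in, CharBuffer out) {
        while (in.hasRemaining()) {
            if (!out.hasRemaining()) {
                return CoderResult.OVERFLOW;
            }
            out.put((char) (in.get() & 0xFF));
        }
        return CoderResult.UNDERFLOW;
    }

    // Convenience wrapper (also hypothetical) that decodes a whole buffer.
    static String decodeToString(ByteBuffer in) {
        try {
            return new Latin1SketchDecoder(StandardCharsets.ISO_8859_1)
                    .decode(in).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Both ByteBuffer.wrap(bytes) and a buffer from ByteBuffer.allocateDirect(n)
can be passed to decodeToString; only the first one goes through the
array-based branch.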

> > Thanks.
> > Vladimir.
> >
> > > --
> > > Paulex Yang
> > > China Software Development laboratory
> > > IBM
> > >
> >
>
>
>
> --
> Paulex Yang
> China Software Development laboratory
> IBM
>
