harmony-dev mailing list archives

From "Alexei Zakharov" <alexei.zakha...@gmail.com>
Subject Re: [classlib][icu] Bringing ICU level up to 3.8
Date Tue, 16 Oct 2007 17:34:34 GMT
Hi Oliver,

I've created a small benchmark too. It takes Book One of Leo Tolstoy's
"War and Peace" as input, converts it from Russian CP-1251 to UTF-16
(10 times) and then back (also 10 times). You can find the benchmark's
source code and a build file at [1]. The first difference from your
benchmark is the language and encoding: Russian in my case. The second
difference is the set of tested VMs: I've run the benchmark on RI, J9
and DRLVM.
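
The measurement described above can be sketched roughly as follows.
This is not the benchmark from [1], only a minimal illustration under
stated assumptions: the class name CharsetRoundTrip, both helper
methods and the optional input-file argument are hypothetical, and the
charset is looked up under the name "windows-1251". The output line
imitates the format of the results in this message.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical sketch, not the code published at [1].
final class CharsetRoundTrip {

    // Decodes the same bytes `repeats` times and prints the elapsed time.
    static CharBuffer decodeRepeatedly(Charset cs, byte[] input, int repeats) {
        CharsetDecoder decoder = cs.newDecoder();
        CharBuffer result = CharBuffer.allocate(0);
        long start = System.currentTimeMillis();
        try {
            for (int i = 0; i < repeats; i++) {
                // decode(ByteBuffer) resets the decoder first, so it can be reused.
                result = decoder.decode(ByteBuffer.wrap(input));
            }
        } catch (CharacterCodingException e) {
            throw new IllegalStateException("input is not valid " + cs.name(), e);
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("<" + decoder.getClass().getName()
                + "> Decoding time: " + elapsed + " millis");
        return result;
    }

    // Encodes the same characters `repeats` times and prints the elapsed time.
    static ByteBuffer encodeRepeatedly(Charset cs, CharBuffer input, int repeats) {
        CharsetEncoder encoder = cs.newEncoder();
        ByteBuffer result = ByteBuffer.allocate(0);
        long start = System.currentTimeMillis();
        try {
            for (int i = 0; i < repeats; i++) {
                // duplicate() keeps the caller's buffer position untouched.
                result = encoder.encode(input.duplicate());
            }
        } catch (CharacterCodingException e) {
            throw new IllegalStateException("text is not encodable in " + cs.name(), e);
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println("<" + encoder.getClass().getName()
                + "> Encoding time: " + elapsed + " millis");
        return result;
    }

    public static void main(String[] args) throws Exception {
        Charset cs = Charset.forName("windows-1251");
        // Assumption: args[0], if given, is a CP-1251 encoded text file.
        // Otherwise a tiny built-in sample ("War and Peace" in Russian) is used.
        byte[] input = args.length > 0
                ? Files.readAllBytes(Paths.get(args[0]))
                : "\u0412\u043e\u0439\u043d\u0430 \u0438 \u043c\u0438\u0440".getBytes(cs);
        CharBuffer chars = decodeRepeatedly(cs, input, 10);
        encodeRepeatedly(cs, chars, 10);
    }
}
```

Which provider actually gets measured depends on what the VM returns
for Charset.forName, which is why the printed class name differs
between runs on different VMs and ICU configurations.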

The results are below. BTW, they show that in this particular test our
internal providers (from the org.apache.harmony.niochar.charset
package) are faster than both versions of ICU. Another interesting
fact is the terrible ICU performance on DRLVM, while on J9 it works
rather fast. IMO this (the bad performance on DRLVM) is something that
should be fixed. And finally, yes, ICU4JNI is a little bit faster than
ICU4J in this test. However, "War and Peace" is a rather big book (the
paper version of the first part runs to about 400 pages, so 10
repetitions come to about 4000 pages), and yet the difference in the
numbers is not that big.

[1] http://people.apache.org/~ayza/icu_experiments/

<sun.nio.cs.MS1251$Decoder> Decoding time: 571 millis
<sun.nio.cs.MS1251$Encoder> Encoding time: 351 millis

<com.ibm.icu.charset.CharsetMBCS$CharsetDecoderMBCS> Decoding time: 430 millis
<com.ibm.icu.charset.CharsetMBCS$CharsetEncoderMBCS> Encoding time: 551 millis

<com.ibm.icu.charset.CharsetMBCS$CharsetDecoderMBCS> Decoding time: 401 millis
<com.ibm.icu.charset.CharsetMBCS$CharsetEncoderMBCS> Encoding time: 540 millis

<org.apache.harmony.niochar.charset.CP_1251$Decoder> Decoding time: 231 millis
<org.apache.harmony.niochar.charset.CP_1251$Encoder> Encoding time: 430 millis

<com.ibm.icu.charset.CharsetMBCS$CharsetDecoderMBCS> Decoding time: 781 millis
<com.ibm.icu.charset.CharsetMBCS$CharsetEncoderMBCS> Encoding time: 620 millis

<com.ibm.icu.charset.CharsetMBCS$CharsetDecoderMBCS> Decoding time: 561 millis
<com.ibm.icu.charset.CharsetMBCS$CharsetEncoderMBCS> Encoding time: 371 millis

<org.apache.harmony.niochar.charset.CP_1251$Decoder> Decoding time: 351 millis
<org.apache.harmony.niochar.charset.CP_1251$Encoder> Encoding time: 540 millis

<com.ibm.icu.charset.CharsetMBCS$CharsetDecoderMBCS> Decoding time: 6660 millis
<com.ibm.icu.charset.CharsetMBCS$CharsetEncoderMBCS> Encoding time: 1071 millis

<com.ibm.icu.charset.CharsetMBCS$CharsetDecoderMBCS> Decoding time: 6179 millis
<com.ibm.icu.charset.CharsetMBCS$CharsetEncoderMBCS> Encoding time: 451 millis

With Best Regards,

2007/10/11, Oliver Deakin <oliver.deakin@googlemail.com>:
> Tony Wu wrote:
> > On 10/8/07, Oliver Deakin <oliver.deakin@googlemail.com> wrote:
> >> Are there any particular
> >> benchmarks you had in mind for this?
> >>
> >>
> > ya, there is a micro benchmark on HARMONY-3709
> >
> >
> <SNIP!>
> I have run the micro benchmark on Harmony with its current ICU
> configuration (icu4jni 3.4.4) and on Harmony with pure icu4j 3.8. The
> results are pretty much as expected - for small jobs icu4j is
> significantly faster, for large jobs icu4jni comes out on top (full
> results at the end of this email). It seems that performance-wise there
> are benefits on both sides depending on the work we are doing.
> Regards,
> Oliver
