harmony-dev mailing list archives

From: Tim Ellison <t.p.elli...@gmail.com>
Subject: Re: [classlib][niochar] charset providers
Date: Fri, 11 Jul 2008 14:14:30 GMT

Yep.

I think there are a few ways we can divide up the charsets, and have 
custom providers on different platforms.

I've tweaked the nio_char module to produce two JAR files, but upon 
reflection think it would be preferable to make them separate modules. 
So unless anybody objects I'll create modules\nio_char_add\*

Regards,
Tim

Nathan Beyer wrote:
> There are probably a few others that we need to include in the 'standard'
> set that are relatively de facto. Off the top of my head, I'd say we need to
> include Cp1252/Windows-1252, as that's the default Windows OS charset.
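
A quick way to see whether a given runtime resolves that charset is to ask `java.nio.charset.Charset` directly. The following is only a sketch: the class name `Cp1252Check` and the helper `isAvailable` are made up for illustration, and whether `windows-1252` or its `Cp1252` alias resolves depends on which providers are installed.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper class, not part of Harmony.
class Cp1252Check {

    /** Returns true if the running VM can resolve the charset name or alias. */
    static boolean isAvailable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // Syntactically illegal names are treated as "not available".
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Default charset: " + Charset.defaultCharset().name());
        String[] names = {"windows-1252", "Cp1252"};
        for (String name : names) {
            System.out.println(name + " supported: " + isAvailable(name));
        }
    }
}
```

Running this against an install with and without the optional charset JARs would show which packaging actually supplies the Windows default.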
> 
> -Nathan
> 
> On Thu, Jul 10, 2008 at 4:38 AM, Tim Ellison <t.p.ellison@gmail.com> wrote:
> 
>> I've been playing with nio_char to see if I can easily reduce the footprint
>> of a Harmony install, and make the charset data more modular.
>>
>> At the moment, that module builds into nio_char.jar (1.32Mb) and
>> hycharset.dll (1.51Mb), and uses the ICU charset code in
>> icu4j-charsets-3_8.jar (2.36Mb).
>>
>> These are so big because a number of the charset encoders/decoders are data
>> driven rather than algorithmic.
>>
>> Q1: Why do we only specify the ICU providers in the services file [1]? Seems
>> like we are not even using the Harmony providers at all.
>>
>> So I updated that file to prefer the Harmony providers.
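
For readers unfamiliar with the mechanism: the services file is a plain-text resource named `META-INF/services/java.nio.charset.spi.CharsetProvider` that lists one provider class name per line, and each listed class extends `java.nio.charset.spi.CharsetProvider`. The sketch below shows the shape of such a provider. Everything named here (`ExampleCharset`, `ExampleCharsetProvider`, the charset name `X-EXAMPLE-ASCII`) is hypothetical and is not Harmony's or ICU's code; the charset simply borrows the US-ASCII coders to stay short.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.spi.CharsetProvider;
import java.util.Collections;
import java.util.Iterator;

// Hypothetical charset that reuses the US-ASCII encoder and decoder.
class ExampleCharset extends Charset {
    private final Charset ascii = Charset.forName("US-ASCII");

    ExampleCharset() {
        super("X-EXAMPLE-ASCII", new String[] {"x-example"});
    }

    public boolean contains(Charset cs) {
        return cs instanceof ExampleCharset;
    }

    public CharsetDecoder newDecoder() {
        return ascii.newDecoder();
    }

    public CharsetEncoder newEncoder() {
        return ascii.newEncoder();
    }
}

// Hypothetical provider. To be picked up through the services file, a real
// provider would have to be a public class with a public no-arg constructor.
class ExampleCharsetProvider extends CharsetProvider {
    private final Charset example = new ExampleCharset();

    public Iterator<Charset> charsets() {
        return Collections.singletonList(example).iterator();
    }

    public Charset charsetForName(String name) {
        if (example.name().equalsIgnoreCase(name)) {
            return example;
        }
        for (String alias : example.aliases()) {
            if (alias.equalsIgnoreCase(name)) {
                return example;
            }
        }
        return null; // not ours; lookup continues with other providers
    }
}
```

Returning `null` for unknown names is what lets several providers coexist, so which provider wins for a name that more than one of them supports comes down to the order in which they are consulted.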
>>
>> The providers use heuristics about when it makes sense to go into the
>> native code to do the encode/decode.  So I've simply added a flag to see if
>> the natives are available, allowing me to remove the hycharset.dll and lose
>> no functionality when space is at a premium.
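
One way such a flag could look is sketched below. This is an assumption about the general shape, not the code that was committed: the class `NativeAvailability`, the threshold value, and the use of `System.loadLibrary("hycharset")` as the availability probe are all illustrative.

```java
// Hypothetical sketch of a "natives available" flag combined with a size heuristic.
class NativeAvailability {

    // Hypothetical cut-off: below this many characters, stay in Java code.
    static final int NATIVE_THRESHOLD = 1024;

    /** Tries to load a native library; reports failure instead of throwing. */
    static boolean tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false; // library not installed, e.g. removed to save space
        } catch (SecurityException e) {
            return false; // not permitted to load native code
        }
    }

    /** Uses the native path only if it exists and the input is large enough. */
    static boolean shouldUseNative(boolean nativesAvailable, int inputLength) {
        return nativesAvailable && inputLength >= NATIVE_THRESHOLD;
    }

    public static void main(String[] args) {
        // Library name assumed from the hycharset.dll file mentioned above.
        boolean available = tryLoad("hycharset");
        System.out.println("natives available: " + available);
        System.out.println("use native for 4096 chars: " + shouldUseNative(available, 4096));
    }
}
```

With the flag evaluated once, the encode/decode paths can fall back to the pure-Java implementation whenever the library is absent.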
>>
>> Finally, the Harmony charsets are split across standard and additional
>> providers, but they are all accessed from the same provider.
>>
>> Q2: What is the distinction used to classify some as standard and some as
>> additional?  The spec requires only a small subset of charsets supported as
>> standard [2].
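
For reference, the `Charset` API documentation cited as [2] requires every implementation to support six charsets: US-ASCII, ISO-8859-1, UTF-8, UTF-16BE, UTF-16LE and UTF-16. A small sketch for checking a runtime against that list follows; the class `RequiredCharsets` and its methods are illustrative names only.

```java
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for comparing a runtime against the spec-required set.
class RequiredCharsets {

    // The six charsets the Charset API documentation requires everywhere.
    static final String[] REQUIRED = {
        "US-ASCII", "ISO-8859-1", "UTF-8", "UTF-16BE", "UTF-16LE", "UTF-16"
    };

    /** Returns the required charsets that the running VM cannot resolve. */
    static List<String> missingRequired() {
        List<String> missing = new ArrayList<String>();
        for (String name : REQUIRED) {
            if (!Charset.isSupported(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    /** True if the canonical name is one of the six required charsets. */
    static boolean isRequired(String canonicalName) {
        for (String name : REQUIRED) {
            if (name.equalsIgnoreCase(canonicalName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("Missing required charsets: " + missingRequired());
        int additional = 0;
        for (String name : Charset.availableCharsets().keySet()) {
            if (!isRequired(name)) {
                additional++;
            }
        }
        System.out.println("Charsets beyond the required set: " + additional);
    }
}
```

Anything counted in the second line of output is, strictly by the spec, a candidate for the optional packaging.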
>>
>> For now I just split the packaging into Harmony's standard/additional
>> charsets.
>>
>>
>> So I've ended up with:
>>  nio_char.jar (155kb)
>>  nio_char_add.jar (1.23Mb) optional
>>  hycharset.dll (1.51Mb)  optional
>>  icu4j-charsets-3_8.jar (2.36Mb) optional
>>
>> Like I say, I'd have to restructure Harmony's definition of standard
>> charsets to comply with the spec since we have far more in there than the
>> spec requires.
>>
>> [1]
>> http://svn.apache.org/viewvc/harmony/enhanced/classlib/trunk/modules/nio_char/src/main/java/org/apache/harmony/niochar/java.nio.charset.spi.CharsetProvider?view=markup
>> [2] http://java.sun.com/j2se/1.5.0/docs/api/java/nio/charset/Charset.html
>>
>> Regards,
>> Tim
>>
> 
