harmony-dev mailing list archives

From Tim Ellison <t.p.elli...@gmail.com>
Subject Re: [classlib] String.toLowerCase/toUpperCase incorrect for supplementary characters (HARMONY-6649)
Date Thu, 16 Sep 2010 15:21:34 GMT
On 16/Sep/2010 16:04, Robert Muir wrote:
> On Thu, Sep 16, 2010 at 10:50 AM, Tim Ellison <t.p.ellison@gmail.com> wrote:
> 
>> The principle works ok.  I attached a patch on HARMONY-6649 to show that
>> making a local toUpperCase() method for the charset names solves the
>> circularity problem in bootstrapping, and does the right thing
>> irrespective of locale.
>>
>> There are some compatibility issues though, since on the RI
>>  String foo="foo";
>>  foo.toLowerCase() == foo
>>
>> but it doesn't if we simply use ICU's toLowerCase method :-(
>>
> 
> maybe we could ask ICU if they could fix this?

Yes, worth asking.

> i looked at their toLowerCase/toUpperCase methods and it wouldn't
> require too much hacking: their UCaseProps returns ~ch whenever ch
> 'folds to itself', so it's easy to track if it's 'unchanged' without
> doing a second pass/equals().

I'm deliberately not looking at the ICU implementation, so that the work
in Harmony is all our own (avoids license confusion, etc.).

Since the uppercasing rules may depend on context, etc., I can only
imagine that we'll have to do the conversion via ICU and then compare the
result with the original to see whether anything changed.  That will
require a second pass through the char array, though.

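 public String toUpperCase(Locale locale) {
     // Let ICU apply the full, locale-sensitive case mapping
     String result = UCharacter.toUpperCase(locale, this);

     // Must return self if chars unchanged (RI compatibility).
     // A different length already means the text changed.
     if (count != result.count) {
         return result;
     }
     // Same length: second pass over the chars to look for a difference
     for (int i = 0; i < count; i++) {
         if (value[offset + i] != result.value[result.offset + i]) {
             return result;
         }
     }
     return this;
 }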
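
For anyone who wants to try the convert-then-compare idea outside of
java.lang.String, here is a minimal standalone sketch that uses only public
JDK API.  The class and method names are hypothetical, and
alwaysNewUpperCase() is only a stand-in for a converter that always
allocates a fresh String (which is the behaviour described above for the
ICU call); it is not ICU's API.

```java
import java.util.Locale;

final class CaseIdentity {

    // Hypothetical stand-in for a converter that never returns its argument:
    // new String(...) always yields a distinct object.
    static String alwaysNewUpperCase(String s, Locale locale) {
        return new String(s.toUpperCase(locale));
    }

    // Convert first, then compare: hand back the original instance when the
    // conversion did not change any character.
    static String toUpperCasePreservingIdentity(String s, Locale locale) {
        String result = alwaysNewUpperCase(s, locale);
        // equals() rejects on a length mismatch before comparing chars,
        // the same order of checks as in the snippet above.
        return result.equals(s) ? s : result;
    }
}
```

With this wrapper an already-uppercase input comes back as the very same
object, while anything that actually changes comes back as the converted
copy.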


> they might be interested in improved compatibility also.

They may have compatibility issues the other way round, i.e. their users
expecting different objects back.

>> I also expect we need to fix String#equalsIgnoreCase() to do the right
>> thing...
>>
>
> yes, i noticed this too, we don't have to deal with Locale issues there, but
> we have to iterate codepoints / compare with Character.toLowerCase(int) and
> Character.toUpperCase(int)...

Ack
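
A rough sketch of what that code point iteration could look like, written
as a standalone helper rather than as the real String method.  The class
and method names are made up for illustration, the arguments are assumed
to be non-null, and the upper-then-lower comparison order is my
assumption, not something taken from the RI.

```java
final class CodePointCompare {

    // Case-insensitive comparison that walks code points instead of chars,
    // so supplementary characters (surrogate pairs) are case-mapped as a unit.
    static boolean equalsIgnoreCaseByCodePoint(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            i += Character.charCount(ca);
            j += Character.charCount(cb);
            if (ca == cb) {
                continue;
            }
            // Compare upper-case forms first, then lower-case forms of those.
            int ua = Character.toUpperCase(ca);
            int ub = Character.toUpperCase(cb);
            if (ua == ub) {
                continue;
            }
            if (Character.toLowerCase(ua) == Character.toLowerCase(ub)) {
                continue;
            }
            return false;
        }
        // Equal only if both strings were consumed completely.
        return i == a.length() && j == b.length();
    }
}
```

For example, U+10400 and U+10428 (Deseret capital and small LONG I) are
each a surrogate pair in UTF-16, and this helper treats them as equal
ignoring case.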

Regards,
Tim
