harmony-dev mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: [classlib][luni] String.toLowerCase/toUpperCase incorrect for supplementary characters (HARMONY-6649)
Date Thu, 23 Sep 2010 11:21:04 GMT
On Wed, Sep 22, 2010 at 10:33 PM, Tim Ellison <t.p.ellison@gmail.com> wrote:

> On 23/Sep/2010 01:10, Robert Muir (JIRA) wrote:
> > I thought about this too,
> >
> > one concern (not knowing if there are more cases involved) would be
> > if the input "should" be ascii, but "could" be something else. if
> > String.toLowerCase had the ascii special-case with a fallback to ICU,
> > it could fail less gracefully in such a situation if it encountered
> > non-ascii rather than simply not matching, especially since unit
> > tests tend to have more coverage for the ascii case...
> >
> > ...but this might be theoretical
> Fail less gracefully than what?  Today, by using String#toLowerCase(),
> invalid ascii gets passed into ICU so will get converted as though it were
> a valid char encoding, so I don't think it would make anything worse
> than it is today.

Well, what I meant to say is that the auto-detect idea seems a bit shaky. If
something needs to do an ASCII-only uppercase/lowercase before ICU is
available, and we know we cannot load ICU yet, then I think
toASCIILowerCase is much better than calling String.toLowerCase and saying
"yeah, we know the input is all ASCII, so it won't load ICU".

toASCIILowerCase will never load ICU, doesn't depend on an
implementation detail of String, and makes it explicit in the code what is
going on.
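
As a rough illustration of why such a helper can never touch ICU, here is a
minimal sketch of an ASCII-only lowercase. It is not the Harmony source: the
class name AsciiCase and the method body are assumptions made for this
example, and only the method name toASCIILowerCase comes from the thread.

```java
// Hypothetical stand-in for an ASCII-only lowercase helper.
// Not the actual org.apache.harmony.luni.util.Util implementation.
final class AsciiCase {
    private AsciiCase() {}

    /**
     * Lowercases only the chars 'A'..'Z'. Every other char, including
     * non-ASCII letters and surrogate pairs, is copied through unchanged,
     * so no locale data or ICU tables are ever consulted.
     */
    static String toASCIILowerCase(String s) {
        char[] out = null; // allocated lazily, only if something changes
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                if (out == null) {
                    out = s.toCharArray();
                }
                out[i] = (char) (c + ('a' - 'A'));
            }
        }
        return out == null ? s : new String(out);
    }
}
```

With this sketch, AsciiCase.toASCIILowerCase("Content-TYPE") gives
"content-type", while a non-ASCII char such as U+0130 is left as-is rather
than being case-mapped, so unexpected input simply fails to match.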

> I think the debate is whether to find and fix places in the class library
> code where we know the input is ascii and change uses of
> String#toLowerCase to use
> org.apache.harmony.luni.util.Util#toASCIILowerCase() [1]

+1, I think this is the best solution.

Robert Muir
