From "William A. Rowe Jr." <wr...@rowe-clan.net>
Subject Re: i18n codepage guidance needed
Date Tue, 12 Apr 2011 19:37:34 GMT
On 4/12/2011 11:56 AM, Jeff Trawick wrote:
> On Tue, Apr 12, 2011 at 12:29 PM, William A. Rowe Jr.
> <wrowe@rowe-clan.net> wrote:
>> I have one dev question for my apr_fnmatch() refactoring.
>> Today we lowercase the two characters (and don't support case-insensitive
>> range matches at all; I won't change this apr-specific quirk).  But IIRC
>> there are languages with multiple lower case representations of the same
>> upper case character, but never (or at least, rarely) vice versa?
>> Shouldn't we upcase both the text and match chars, instead, to better
>> support non-ASCII locales?  (Obviously, this ignores utf-8 issues, and
>> I'm not going to enable MBCS in this next release, but will at least make
>> it possible to enhance for MBCS later on, without changing fn prototypes).
> No real answer, just some comments...
> * FWLIW, it is tolower() now "just because."  It was originally toupper().
> * For interesting text, it could change behavior, and we don't have
> bugs filed now, right?
> * For interesting text, neither toupper() nor tolower() nor == is
> correct!  (So don't bother changing behavior.)

I think I found the answer to "just because", thanks Deutschlanders... from
the Linux manpage...

  In some non-English locales, there are lowercase letters with no
  corresponding uppercase equivalent; the German sharp s is one example.
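
To make the trade-off concrete, here is a rough sketch of the folding choices
for a single-byte comparison. The helper names are hypothetical and are not
part of APR; the third variant (accept a match if either fold agrees) is only
one possible compromise, not what apr_fnmatch() does today. Results for
non-ASCII bytes depend entirely on the active locale, so the self-check only
exercises ASCII.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical helpers for illustration only; none of these exist in APR. */

/* Fold both characters to lowercase before comparing. */
static int eq_fold_lower(unsigned char a, unsigned char b)
{
    return tolower(a) == tolower(b);
}

/* Fold both characters to uppercase before comparing. */
static int eq_fold_upper(unsigned char a, unsigned char b)
{
    return toupper(a) == toupper(b);
}

/* Assumed compromise: treat the pair as equal if either fold agrees. */
static int eq_fold_either(unsigned char a, unsigned char b)
{
    return eq_fold_lower(a, b) || eq_fold_upper(a, b);
}

/* Sanity check for ASCII letters, which fold the same way in any locale
 * derived from the "C" locale. */
static void check_ascii_folding(void)
{
    assert(eq_fold_lower('a', 'A'));
    assert(eq_fold_upper('a', 'A'));
    assert(eq_fold_either('z', 'Z'));
    assert(!eq_fold_either('a', 'b'));
}
```

A letter with no counterpart in the other case (such as the sharp s the
manpage mentions) is left unchanged by the fold that has nothing to map it to,
so it can only match itself. The two folds can disagree when a locale maps two
distinct letters of one case onto a single letter of the other case.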

Still pondering.

