httpd-dev mailing list archives

From "Bert Huijben" <>
Subject RE: apr_token_* conclusions (was: Better casecmpstr[n]?)
Date Wed, 25 Nov 2015 23:27:56 GMT


For some background and related bugs in several products.


I hope this blog will stay alive. (The author passed away recently)




From: Bert Huijben [] 
Sent: Thursday, 26 November 2015 00:22
Subject: RE: apr_token_* conclusions (was: Better casecmpstr[n]?)


The example was the other way around. Changing SS to ß is not a valid transform, but the
other way is. There are also transforms on the combined AE characters, etc.
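
To illustrate why the mapping only works in one direction, here is a minimal sketch; it is not anything from APR or httpd. The function name `latin1_upper_expand` is hypothetical, and the code assumes ISO-8859-1 input, where ß is the single byte 0xDF. Upper-casing expands ß to "SS", so the output can be longer than the input, and lower-casing "SS" afterwards cannot tell whether the original was "ss" or "ß".

```c
#include <stddef.h>

/* Hypothetical helper: upper-case an ISO-8859-1 string, expanding
 * sharp-S (byte 0xDF) to "SS". Only ASCII letters and sharp-S are
 * handled; all other bytes are copied unchanged.
 * Returns the output length, or (size_t)-1 if out is too small. */
size_t latin1_upper_expand(const char *in, char *out, size_t outsz)
{
    size_t n = 0;

    for (; *in != '\0'; ++in) {
        unsigned char c = (unsigned char)*in;

        if (c == 0xDF) {                 /* sharp-S becomes two bytes */
            if (n + 2 >= outsz)          /* keep room for the NUL */
                return (size_t)-1;
            out[n++] = 'S';
            out[n++] = 'S';
        }
        else {
            if (n + 1 >= outsz)          /* keep room for the NUL */
                return (size_t)-1;
            if (c >= 'a' && c <= 'z')
                c = (unsigned char)(c - 'a' + 'A');
            out[n++] = (char)c;
        }
    }
    if (outsz == 0)
        return (size_t)-1;
    out[n] = '\0';
    return n;
}
```

Because the result may grow, this cannot be done in place by a one-character-at-a-time routine such as toupper().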


That Turkish ‘I’ problem is the only case I know of where the collation actually changes
behavior within the usual western alphabet of ASCII characters.





From: Mikhail T. [] 
Sent: Wednesday, 25 November 2015 23:19
To: <> 
Subject: Re: apr_token_* conclusions (was: Better casecmpstr[n]?)


On 25.11.2015 14:10, Mikhail T. wrote:

Two variables, LC_CTYPE and LC_COLLATE control this text processing behavior.  The above is
the correct lower case transliteration for Turkish.  In German, the upper case correspondence
of sharp-S ß is 'SS', but multi-char translation is not provided by the simple tolower/toupper

So the concern is that some hypothetical header, such as X-ASSIGN-TO, may unexpectedly
become x-aßign-to after going through the locale-aware strtolower()?
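
The concern can be checked against how a per-character lowering routine behaves. The sketch below is an assumption about what such a strtolower() looks like (the name `locale_strtolower` is hypothetical). It calls the standard tolower() once per byte, so the result always has the same length as the input, and the two bytes "SS" cannot merge into a single ß.

```c
#include <ctype.h>

/* Hypothetical locale-aware lowering: applies tolower() to each byte
 * in place, using whatever LC_CTYPE locale is currently active.
 * One byte in, one byte out, so the string length never changes. */
void locale_strtolower(char *s)
{
    for (; *s != '\0'; ++s) {
        /* Cast to unsigned char: passing a negative char value to
         * tolower() is undefined behavior. */
        *s = (char)tolower((unsigned char)*s);
    }
}
```

In the default "C" locale this turns X-ASSIGN-TO into x-assign-to. Other locales may still map individual letters differently, which is the separate Turkish 'I' issue raised in this thread.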

I just tested the above on both FreeBSD and Linux, and the results are encouraging:

% echo STRASSE | env LANG=de_DE.ISO8859 tr '[[:upper:]]' '[[:lower:]]'

Thus, I contend, using the C library will not cause invalid results, and the only reason to
have Apache's own implementation is performance, not correctness.
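
For reference, the alternative being weighed against the C library can be sketched in a few lines. This is a minimal illustration, not the actual APR or httpd code; the names `ascii_tolower` and `ascii_casecmp` are hypothetical. It folds only the bytes 'A' to 'Z', so its result does not depend on LC_CTYPE.

```c
/* Hypothetical ASCII-only fold: maps 'A'..'Z' to 'a'..'z' and leaves
 * every other byte alone, regardless of the current locale. */
static int ascii_tolower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c;
}

/* Hypothetical case-insensitive comparison with strcmp-like results:
 * negative, zero, or positive. */
int ascii_casecmp(const char *a, const char *b)
{
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;

    for (;;) {
        int ca = ascii_tolower(*pa++);
        int cb = ascii_tolower(*pb++);

        if (ca != cb)
            return ca - cb;
        if (ca == '\0')
            return 0;
    }
}
```

A call such as ascii_casecmp("X-ASSIGN-TO", "x-assign-to") returns 0 whatever locale the process has selected.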

