httpd-dev mailing list archives

From "Bert Huijben" <b...@qqmail.nl>
Subject RE: apr_token_* conclusions (was: Better casecmpstr[n]?)
Date Wed, 25 Nov 2015 23:27:56 GMT
See http://www.siao2.com/2004/12/03/274288.aspx

And http://www.siao2.com/2013/04/04/10407543.aspx

For some background and related bugs in several products.

 

I hope this blog stays online. (The author passed away recently.)

 

                Bert

 

From: Bert Huijben [mailto:bert@qqmail.nl] 
Sent: Thursday, 26 November 2015 00:22
To: dev@httpd.apache.org
Subject: RE: apr_token_* conclusions (was: Better casecmpstr[n]?)

 

The example was the other way around: changing SS to ß is not a valid transform, but ß to
SS is. Similar transforms exist for combined characters such as Æ (AE).

 

That Turkish ‘I’ problem is the only case I know of where the collation actually changes
behavior within the usual western alphabet of ASCII characters.
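
To make the Turkish 'I' point concrete: under a Turkish LC_CTYPE, tolower('I') may not yield 'i', so a comparison built on it can stop matching plain ASCII tokens. A minimal sketch of a fold that ignores the locale entirely is below. The names ascii_tolower and ascii_casecmp are hypothetical, chosen for illustration; they are not the APR functions discussed in this thread.

```c
/* Fold one byte to lower case using ASCII rules only. Bytes outside
 * 'A'..'Z' are returned unchanged, whatever the current locale is. */
static unsigned char ascii_tolower(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Hypothetical helper: compare two NUL-terminated strings, ignoring
 * ASCII case only. Returns <0, 0 or >0, like strcmp(). */
static int ascii_casecmp(const char *a, const char *b)
{
    for (;; a++, b++) {
        unsigned char ca = ascii_tolower((unsigned char)*a);
        unsigned char cb = ascii_tolower((unsigned char)*b);

        if (ca != cb)
            return (int)ca - (int)cb;
        if (ca == '\0')
            return 0;   /* both strings ended together */
    }
}
```

Because the fold never consults the locale, ascii_casecmp("TITLE", "title") returns 0 regardless of LC_CTYPE or LC_COLLATE.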

 

                Bert

 

 

From: Mikhail T. [mailto:mi+thun@aldan.algebra.com] 
Sent: Wednesday, 25 November 2015 23:19
To: dev@httpd.apache.org
Subject: Re: apr_token_* conclusions (was: Better casecmpstr[n]?)

 

On 25.11.2015 14:10, Mikhail T. wrote:

Two variables, LC_CTYPE and LC_COLLATE control this text processing behavior.  The above is
the correct lower case transliteration for Turkish.  In German, the upper case correspondence
of sharp-S ß is 'SS', but multi-char translation is not provided by the simple tolower/toupper
functions.

So, the concern is, some hypothetical header, such as X-ASSIGN-TO may, after going through
the locale-aware strtolower() unexpectedly become x-aßign-to?

I just tested the above on both FreeBSD and Linux, and the results are encouraging:

% echo STRASSE | env LANG=de_DE.ISO8859 tr '[[:upper:]]' '[[:lower:]]'
strasse

Thus, I contend, using the C library will not cause invalid results, and the only reason to
have Apache's own implementation is performance, not correctness.
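
The same check can be sketched in C rather than with tr. tolower() maps one character to one character, so it cannot merge "SS" into a single ß, and the string length never changes. The helper name lower_in_place is hypothetical, and the sketch assumes the process is still in the default "C" locale (no setlocale() call has been made); it does not exercise the German or Turkish locales.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: lower-case buf in place with the C library's
 * tolower(). Returns the string length, which a one-to-one character
 * mapping cannot change. */
static size_t lower_in_place(char *buf)
{
    size_t i;

    for (i = 0; buf[i] != '\0'; i++)
        buf[i] = (char)tolower((unsigned char)buf[i]);
    return i;
}
```

Applied to a writable copy of "X-ASSIGN-TO", this leaves "x-assign-to" in the buffer and returns 11, the original length.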

-mi

