httpd-apreq-dev mailing list archives

From David Wheeler <da...@kineticode.com>
Subject Re: Apache::Request, APR::Table and UTF8
Date Tue, 05 Oct 2004 16:34:30 GMT
On Oct 5, 2004, at 9:20 AM, Joe Schaefer wrote:

> We could use three-bit field for marking the charset:
>
>   0 - unknown
>   1 - ASCII
>   2 - UTF-8
>   3 - UTF-16
>   [ room for 4 more iso? charsets ]

Note that data encoded in UTF-8 is not the same as data decoded to 
Perl's internal utf8 format. The latter has the same bytes, but the 
"utf8" flag has been set on the variable so that Perl knows to treat 
the string as characters rather than bytes, for example when counting 
its length. I suspect that what Boris is reporting is the loss of this 
flag on the string variable.
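
A minimal sketch of the distinction, assuming the core Encode module is 
available. The sample string and the variable names are illustrative 
only and are not taken from Boris's report:

```perl
use strict;
use warnings;
use Encode qw(decode);

# "cafe" with an acute accent on the e, as UTF-8 encoded bytes.
# The accented character occupies two bytes (0xC3 0xA9).
my $bytes = "caf\xC3\xA9";

# Decoding yields a character string with the utf8 flag set.
my $chars = decode('UTF-8', $bytes);

printf "bytes: length=%d utf8 flag=%s\n",
    length($bytes), utf8::is_utf8($bytes) ? 'on' : 'off';
printf "chars: length=%d utf8 flag=%s\n",
    length($chars), utf8::is_utf8($chars) ? 'on' : 'off';
```

Without the flag, length() counts the five bytes; with it, length() 
counts the four characters.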

Regards,

David

