httpd-apreq-dev mailing list archives

From Boris Zentner <...@2bz.de>
Subject Re: Apache::Request, APR::Table and UTF8
Date Tue, 05 Oct 2004 22:32:43 GMT

Hi,

On 05.10.2004 at 19:48, Geoffrey Young wrote:

>
>> We store params in a C struct that's
>> currently devoid of any charset information.  Boris wants to mark
>> certain params as being utf8, so when a later Apache::Request module
>> fetches them, they'll be returned from param() as utf8 strings instead
>> of byte strings (iow, param will set the "utf8" flag for them).
>
> not to be the only one in the class that isn't able to follow along, but
> does calling Encode::_utf8_on() from userland not work?
>

No, it works, but only if _you_ know that this param is encoded in
utf8. If you do not know, Perl does the wrong thing and converts it to
utf8 a second time, which results in a string that is not ASCII, not
ISO-something, and not UTF-8, just wrong data. Your 'solution' implies
that you wrote both the part that inserts the data and the part that
works with it. It will never work with someone else's module, since its
author cannot know.
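
To make the double-encoding failure concrete, here is a minimal sketch
using only the core Encode module. The sample value (the two UTF-8
octets for "ü") is an illustration of mine, not data taken from apreq:

```perl
use strict;
use warnings;
use Encode ();

# "ü" (U+00FC) as the two UTF-8 octets a client would submit.
my $octets = "\xC3\xBC";

# Case 1: the consumer knows the octets are UTF-8 and decodes them once.
my $decoded = Encode::decode('UTF-8', $octets);

# Case 2: nobody marked the value, so Perl treats each octet as a
# separate Latin-1 character and encodes it a second time.
my $double = Encode::encode('UTF-8', $octets);

printf "decoded length: %d\n", length $decoded;        # 1
printf "double-encoded: %s\n", unpack 'H*', $double;   # c383c2bc
```

The second result is four octets that are neither the original UTF-8
nor any ISO-8859 rendering of "ü", which is the "just wrong data" case.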

It is really *not* a userspace problem.

If your source uses Encode::_utf8_on, it is the same as setting the utf8
flag to always on, and that's wrong. It also means a string has to be
encoded on every insert. It is allowed to insert data into apr with
$apr->param( text => $var ); if $var is already in utf8, I get the wrong
data back.
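
A small sketch of why a blanket "flag always on" is unsafe, again using
only core Perl. The input octet is a hypothetical ISO-8859-1 value, and
the strict decode is just one possible check, not something apreq does:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical input: "ü" as a single ISO-8859-1 octet, not UTF-8.
my $latin1 = "\xFC";

# Blanket approach: force the flag on without looking at the data.
my $forced = $latin1;
Encode::_utf8_on($forced);
my $forced_ok = utf8::valid($forced) ? 1 : 0;

# Checked approach: decode strictly and trap the failure instead.
my $checked = eval {
    Encode::decode('UTF-8', $latin1,
                   Encode::FB_CROAK() | Encode::LEAVE_SRC());
};
my $checked_ok = defined $checked ? 1 : 0;

print "forced flag gives a valid string: $forced_ok\n";
print "strict decode accepted the octets: $checked_ok\n";
```

Forcing the flag leaves a scalar that claims to be utf8 but holds
malformed data, while the strict decode at least refuses the value.
Neither helps a later module that never learns what the charset was.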

> I guess I'm still trying to reconcile how this is not a user-space problem -
> I would think that the various functions in Encode.pm and utf8.pm would be
> sufficient for most of this stuff.
>
> --Geoff
>
>
--
Boris

