httpd-apreq-dev mailing list archives

From David Wheeler <>
Subject Re: Apache::Request, APR::Table and UTF8
Date Tue, 05 Oct 2004 23:21:25 GMT
On Oct 5, 2004, at 3:56 PM, Stas Bekman wrote:

> So once these flags are there, it's a total user's responsibility to 
> do the right thing, correct? I was just wondering whether we should 
> do that in the APR::Table instead, if it's going to serve other 
> purposes. Of course if this is only really important for 
> Apache::Request, then that's probably the best place to do that.

I think that Boris wants it to not be the user's responsibility, and 
I'm inclined to agree with him. (Holler if I'm misrepresenting your 
position, Boris.) For the same reason that when I store a reference I 
don't want to get back a string such as "HASH(0x800368)", when I store 
a string that's got the utf8 flag enabled, I want to get it back that 
way. And if I have an array of strings, some with the flag set and some 
without, I don't know which is which and I don't want to have to think 
about which is which.

> Of course users can store the string with the flag by themselves, by 
> simply storing: "$flag" . "string" and get it back as they have stored 
> it.

The flag is not a Perl scalar. It's an attribute of the C struct for 
the scalar variable (sorry if I'm mangling the terminology here, I'm 
not a C hacker). So it would be very inefficient to have to store all 
my strings in a table like this:

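# Store
while (my ($k, $v) = each %stringmap) {
     $r->pnotes( $k => $v );
     # Record the flag under a companion key, since only the value is kept.
     $r->pnotes( "$k.utf8" => Encode::is_utf8($v) );
}

# Fetch:
my %stringmap;
while (my ($k, $v) = each %{ {$r->pnotes }}) {
     next if $k =~ /\.utf8$/;
     # Turn the flag back on if the companion key says it was set.
     Encode::_utf8_on($v) if $r->pnotes("$k.utf8");
     $stringmap{$k} = $v;
}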

This is an extreme example, but I think that it makes the case. apreq 
has to be able to know what kind of scalar it has, anyway, so it should 
be able to know if the string is decoded to Perl's internal utf8 
representation, and if so, make sure it keeps the utf8 flag on.
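For anyone who wants to see the flag itself, here is a minimal sketch using only the core Encode module. The store_bytes() sub is a hypothetical stand-in for any byte-only store (it is not apreq or APR::Table code), and the sample string is made up for illustration. It only shows that the flag is a property of the scalar that disappears once the value is reduced to octets.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical stand-in for a byte-only store such as a C-level table:
# it keeps the octets of the value and nothing else.
sub store_bytes {
    my ($value) = @_;
    return Encode::encode('UTF-8', $value);
}

# Decode five UTF-8 octets into a four-character Perl string (flag on).
my $decoded = Encode::decode('UTF-8', "caf\xc3\xa9");

# What comes back from the store is five octets with the flag off.
my $fetched = store_bytes($decoded);

printf "decoded: flag=%d length=%d\n",
    Encode::is_utf8($decoded) ? 1 : 0, length $decoded;
printf "fetched: flag=%d length=%d\n",
    Encode::is_utf8($fetched) ? 1 : 0, length $fetched;

# Restoring by hand is the per-value work the caller would be left with.
my $restored = Encode::decode('UTF-8', $fetched);
print "round trip ok\n" if $restored eq $decoded;
```

The two strings hold the same text but compare as different until the caller decodes the fetched value again, which is the bookkeeping I would rather the library did for me.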
