httpd-apreq-dev mailing list archives

From Stas Bekman
Subject Re: Apache::Request, APR::Table and UTF8
Date Tue, 05 Oct 2004 23:55:25 GMT
David Wheeler wrote:
> On Oct 5, 2004, at 3:56 PM, Stas Bekman wrote:
>> So once these flags are there, it's entirely the user's responsibility to 
>> do the right thing, correct? I was just wondering whether we should 
>> do that in the APR::Table instead, if it's going to serve other 
>> purposes. Of course if this is only really important for 
>> Apache::Request, then that's probably the best place to do that.
> I think that Boris wants it to not be the user's responsibility, and I'm 
> inclined to agree with him. (Holler if I'm misrepresenting your 
> position, Boris.) For the same reason that when I store a reference I 
> don't want to get back a string such as "HASH(0x800368)", when I store a 
> string that's got the utf8 flag enabled, I want to get it back that way. 
> And if I have an array of strings, some with the flag set and some 
> without, I don't know which is which and I don't want to have to think 
> about which is which.

What you and Boris say makes perfect sense, but we certainly don't want to 
try to support every possible format out there and handle each one on 
behalf of users. If we did, we might need to spin APR::Table off into a 
project of its own with its own dedicated team of developers. This is 
something suitable for Apache::Request, which deals with a subset of 
formats; it's not suitable for APR::Table, which is a general-purpose thing.

I suggested to Boris that he convert the utf8 string back into a byte 
string before storing it in the table, but he doesn't want that, since it 
adds unwanted overhead.

I don't know if it's a good idea to have APR::Table handle utf8 as its 
one and only special case. Joe?

>> Of course users can store the string with the flag by themselves, by 
>> simply storing: "$flag" . "string" and get it back as they have stored 
>> it.
> The flag is not a Perl scalar. It's an attribute of the C struct for the 
> scalar variable (sorry if I'm mangling the terminology here, I'm not a C 
> hacker). So it would be very inefficient to have to store all my strings 
> in a table like this

Sure, but you can make it a Perl scalar.

Take the string "foo". On the way into the table, prefix it with the flag:

utf8     => "1foo"
non-utf8 => "0foo"

On the way out, split off the first char, check it, and set the flag 
accordingly. Won't that work?
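
The prefix scheme above could be sketched roughly as follows. This is only an illustration under stated assumptions: pack_flagged and unpack_flagged are hypothetical helper names, not part of APR::Table or Apache::Request, and a plain hash stands in for the table, which is assumed to keep only the raw octets of each value.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: prefix the value with "1" or "0" according to its
# utf8 flag, and hand the table a plain byte string.
sub pack_flagged {
    my ($value) = @_;
    my $flag  = Encode::is_utf8($value) ? '1' : '0';
    my $bytes = $value;
    utf8::encode($bytes) if $flag eq '1';   # characters -> UTF-8 octets
    return $flag . $bytes;
}

# Hypothetical helper: split off the first char and set the flag again
# if it was on when the value went in.
sub unpack_flagged {
    my ($stored) = @_;
    my $flag  = substr($stored, 0, 1);
    my $value = substr($stored, 1);
    Encode::_utf8_on($value) if $flag eq '1';
    return $value;
}

my %table;                                  # stand-in for the real table
my $word = "caf\x{e9}";
utf8::upgrade($word);                       # make sure the utf8 flag is on
$table{word} = pack_flagged($word);
my $restored = unpack_flagged($table{word});
print Encode::is_utf8($restored) ? "flag restored\n" : "flag lost\n";
```

The cost is one extra byte per value plus a substr on every fetch, and every reader of the table has to know about the convention.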

> # Store
> while (my ($k, $v) = each %stringmap) {
>     $r->pnotes( $k => $v );
>     $r->pnotes( "$k.utf8" => Encode::is_utf8($v) );
> }
> # Fetch:
> my %stringmap;
> while (my ($k, $v) = each %{ {$r->pnotes }}) {
>     next if $k =~ /\.utf8$/;
>     Encode::_utf8_on($v) if $r->pnotes("$k.utf8");
>     $stringmap{$k} = $v;
> }
> This is an extreme example, but I think that it makes the case. apreq 
> has to be able to know what kind of scalar it has, anyway, so it should 
> be able to know if the string is decoded to Perl's internal utf8 
> representation, and if so, make sure it keeps the utf8 flag on.

You certainly have no need to do that, since pnotes already preserves the 
Perl flags.
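
The difference between keeping the scalar itself and keeping only its octets can be seen in plain Perl. This is a sketch by analogy, not a test of mod_perl: %scalar_store and %octet_store are hypothetical stand-ins, the first for a store that holds Perl scalars (as pnotes is said to above), the second for a table that is assumed to hold only C strings.

```perl
use strict;
use warnings;
use Encode ();

my $text = "caf\x{e9}";
utf8::upgrade($text);                 # force the internal utf8 representation

# Stand-in for a store of Perl scalars: the copy carries the flag with it.
my %scalar_store = (greeting => $text);

# Stand-in for a C string table: only the raw UTF-8 octets survive.
my %octet_store = (greeting => Encode::encode_utf8($text));

printf "scalar store keeps flag: %s\n",
    Encode::is_utf8($scalar_store{greeting}) ? "yes" : "no";
printf "octet store keeps flag:  %s\n",
    Encode::is_utf8($octet_store{greeting}) ? "yes" : "no";
```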

Stas Bekman            JAm_pH ------> Just Another mod_perl Hacker     mod_perl Guide --->
