httpd-apreq-dev mailing list archives

From Boris Zentner <>
Subject Re: Apache::Request, APR::Table and UTF8
Date Wed, 06 Oct 2004 00:13:07 GMT

On 06.10.2004 at 01:55, Stas Bekman wrote:

> David Wheeler wrote:
>> On Oct 5, 2004, at 3:56 PM, Stas Bekman wrote:
>>> So once these flags are there, it's entirely the user's responsibility to 
>>> do the right thing, correct? I was just wondering whether we should 
>>> do that in the APR::Table instead, if it's going to serve other 
>>> purposes. Of course if this is only really important for 
>>> Apache::Request, then that's probably the best place to do that.
>> I think that Boris wants it to not be the user's responsibility, and 
>> I'm inclined to agree with him. (Holler if I'm misrepresenting your 
>> position, Boris.) For the same reason that when I store a reference I 
>> don't want to get back a string such as "HASH(0x800368)", when I 
>> store a string that's got the utf8 flag enabled, I want to get it 
>> back that way. And if I have an array of strings, some with the flag 
>> set and some without, I don't know which is which and I don't want to 
>> have to think about which is which.
> What you and Boris say makes perfect sense, but we certainly don't 
> want to try to support all possible formats out there and try to 
> handle those on behalf of users. If we do, we may need to spin off 
> APR::Table as a project of its own with its own dedicated 
> developer team. This is something suitable for Apache::Request, which 
> deals with a sub-set of formats, it's not suitable for APR::Table 
> which is a general purpose thing.
> I've proposed to Boris to try to decode the utf8 variable back into 
> a byte string before storing it in the table, but he doesn't 
> want that, since it adds unwanted overhead.

Decoding is not enough: I do not know when to decode. And if I record 
somewhere when I have to decode, or copy the data to a CGI param object 
as I do currently, I lose *the* big advantage of mod_perl for me, its

So if my application uses libapreq to get the data and then copies it to a 
CGI object, why not use CGI in the first place?
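
To illustrate why the reader of the table cannot know when to decode, here is a minimal sketch. It assumes a plain Perl hash standing in for a byte-only table; `byte_set` and `byte_get` are hypothetical helpers for illustration, not part of APR::Table or libapreq.

```perl
use strict;
use warnings;

# Stand-in for a byte-only table (an assumption for illustration only).
# Values are flattened to UTF-8 bytes on the way in, so the flag is lost.
my %byte_table;

sub byte_set {
    my ($key, $value) = @_;
    my $copy = $value;
    utf8::encode($copy) if utf8::is_utf8($copy);   # characters -> bytes
    $byte_table{$key} = $copy;
}

sub byte_get { return $byte_table{ $_[0] } }

my $chars = "\x{263A}";        # one character, utf8 flag on
my $bytes = "\xE2\x98\xBA";    # three raw bytes, utf8 flag off

byte_set(chars => $chars);
byte_set(bytes => $bytes);

# Once stored, the two values can no longer be told apart.
my $same = byte_get('chars') eq byte_get('bytes') ? 1 : 0;
my $flag = utf8::is_utf8(byte_get('chars'))        ? 1 : 0;
print "identical after storage: $same\n";
print "utf8 flag on fetched value: $flag\n";
```

Both entries come back as the same three bytes, so code that fetches them has no way to tell which one should be decoded.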

> I don't know if it's a good idea to have APR::Table deal with 
> utf8 as the only special case. Joe?
>>> Of course users can store the string with the flag by themselves, by 
>>> simply storing: "$flag" . "string" and get it back as they have 
>>> stored it.
>> The flag is not a Perl scalar. It's an attribute of the C struct for 
>> the scalar variable (sorry if I'm mangling the terminology here, I'm 
>> not a C hacker). So it would be very inefficient to have to store all 
>> my strings in a table like this
> sure, but you can make it a perl scalar.
> string "foo" on the way to the table:
> utf8     => "1foo";
> non-utf8 => "0foo"

This is one quick idea that I have already considered. But CGI already 
handles the utf8 flag correctly.

> on the way out, split and check the first char, and set the flag 
> accordingly. Won't that work?
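
The prefix scheme described above could look roughly like this. It is only a sketch: a plain hash stands in for the table, and `store_flagged` and `fetch_flagged` are hypothetical names, not existing APR::Table methods. It uses `utf8::decode` on the way out rather than forcing the flag on, which is an assumption on my side.

```perl
use strict;
use warnings;

my %table;   # stand-in for a byte-only table (assumption for illustration)

# Prepend "1" for strings with the utf8 flag set, "0" otherwise.
sub store_flagged {
    my ($key, $value) = @_;
    my $flag = utf8::is_utf8($value) ? 1 : 0;
    my $copy = $value;
    utf8::encode($copy) if $flag;      # store UTF-8 bytes, never wide chars
    $table{$key} = $flag . $copy;
}

# Split off the first character and decode only if it was "1".
sub fetch_flagged {
    my ($key) = @_;
    my $raw = $table{$key};
    return unless defined $raw;
    my $flag  = substr($raw, 0, 1);
    my $value = substr($raw, 1);
    utf8::decode($value) if $flag;     # bytes -> characters again
    return $value;
}

store_flagged(smiley => "\x{263A}foo");        # character string
store_flagged(raw    => "\xE2\x98\xBAfoo");    # byte string

printf "smiley: length %d, flag %d\n",
    length(fetch_flagged('smiley')), utf8::is_utf8(fetch_flagged('smiley')) ? 1 : 0;
printf "raw:    length %d, flag %d\n",
    length(fetch_flagged('raw')), utf8::is_utf8(fetch_flagged('raw')) ? 1 : 0;
```

Every caller of set and get would have to go through such wrappers, and each value is copied on both the way in and the way out.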
>> # Store each string plus a companion "<key>.utf8" entry recording its flag
>> while (my ($k, $v) = each %stringmap) {
>>     $r->pnotes( $k => $v );
>>     $r->pnotes( "$k.utf8" => Encode::is_utf8($v) );
>> }
>> # Fetch:
>> my %stringmap;
>> while (my ($k, $v) = each %{ $r->pnotes }) {
>>     next if $k =~ /\.utf8$/;
>>     Encode::_utf8_on($v) if $r->pnotes("$k.utf8");
>>     $stringmap{$k} = $v;
>> }
>> This is an extreme example, but I think that it makes the case. apreq 
>> has to be able to know what kind of scalar it has, anyway, so it 
>> should be able to know if the string is decoded to Perl's internal 
>> utf8 representation, and if so, make sure it keeps the utf8 flag on.
> You certainly have no need to do that since pnotes already preserves 
> the perl flags.

But I have to copy the data to a new object, and I have to rewrite all 
the code that uses $t->set, $t->get, $t->do and so on.
What David shows is only one way to preserve the flag.

