perl-asp mailing list archives

From Josh Chamas <j...@chamas.com>
Subject Re: Output character encoding
Date Tue, 05 Jun 2012 23:30:41 GMT
On 6/5/12 2:02 AM, Arnon Weinberg wrote:
>
> How can I set the output character encoding of Apache::ASP output?
> ...

Hi Arnon, All,

I have gone over the thread and have been stumped on this for a while.  Bottom
line: it looks like Apache::ASP does not play well with Encode, and this seems
to me to be related to the PerlIO interactions, with something not quite
connecting right on a tied file handle.  But I do not know the answer to solve
this. :(

To explain where there is some magic at play:

Apache::ASP::Response does a "use bytes", which is there to deal with the output
stream correctly; I believe this is for the content length calculations.  I
think this is fine here, and turning it off makes things worse for these
examples.

Apache::ASP::Response is more importantly tied as a file handle when this code 
is run:

         tie *RESPONSE, 'Apache::ASP::Response', $self->{Response};
         select(RESPONSE);

This allows print to go to $Response->PRINT, which aliases to
$Response->Write.  Fundamentally, all output goes through $Response->Write at
the end of the day, including the script's static content itself.
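
To make the tie/select mechanism concrete, here is a minimal standalone
sketch.  My::Response is a hypothetical stand-in, not the real
Apache::ASP::Response class; it only shows how a bare print ends up in PRINT
and then in Write once the handle is tied and selected.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Apache::ASP::Response; not the real class.
package My::Response;

sub new { my ($class) = @_; return bless { buffer => '' }, $class }

# tie() passes its extra arguments to TIEHANDLE; here we simply hand
# back the existing object so the handle and the object are the same.
sub TIEHANDLE { my ($class, $obj) = @_; return $obj }

# All output is collected here.
sub Write { my ($self, @data) = @_; $self->{buffer} .= join('', @data); return 1 }

# print on the tied handle lands in PRINT, which forwards to Write.
sub PRINT { my $self = shift; return $self->Write(@_) }

package main;

my $response = My::Response->new;
tie *RESPONSE, 'My::Response', $response;
my $previous = select(*RESPONSE);   # a bare print now targets RESPONSE

print "hello ", "world";            # routed to My::Response::PRINT

select($previous);                  # restore the original default handle
print "captured: $response->{buffer}\n";
```

Nothing is written to the real STDOUT by the bare print; the data is only
visible through the object's buffer after the default handle is restored.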

What I have found is that this will output the correct bytes in this Apache::ASP 
script:

<% print STDOUT Encode::decode('ISO-8859-1',"\xE2"); %>

as it bypasses the tied file handle layer to $Response, so we know perl is 
working at this point!

But this is where we have a problem:

<% print Encode::decode('ISO-8859-1',"\xE2"); %>

and immediately, in the Apache::ASP::Response::Write() method, the data has
already been converted incorrectly without any processing having occurred.
It's as if merely going through the tied interface puts the data through some
conversion process.  I have played with various IO settings, as in "open ...",
and various "use" pragmas to no avail, but I am really shooting blind here on
what could be going wrong.

So the way I see it..

Encoding Magic
File handle tie Magic  <--- data conversion
Data to $Response->Write

Encode and perltie seem to have some conflicting bits here.
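
One way to narrow this down is to compare how the same character looks to
code that runs under "use bytes", depending on how the string was built.  The
following is a diagnostic sketch only, not a confirmed diagnosis; it assumes
the difference between the test strings lies in Perl's internal UTF8 flag, and
the %report hash is just a hypothetical helper for collecting the results.

```perl
use strict;
use warnings;
use Encode ();

# The same character, U+00E2, built two of the ways used in the test script.
my $decoded = Encode::decode('ISO-8859-1', "\xE2");
my $plain   = chr(0xE2);

my %report;   # hypothetical helper, only used to collect the findings
for my $pair (['decoded', $decoded], ['plain', $plain]) {
    my ($name, $str) = @$pair;
    my $char_len = length $str;                    # length in characters
    my $byte_len = do { use bytes; length $str };  # length as seen under "use bytes"
    $report{$name} = {
        flag  => utf8::is_utf8($str) ? 1 : 0,
        chars => $char_len,
        bytes => $byte_len,
    };
    printf "%-8s utf8_flag=%d chars=%d bytes=%d\n",
        $name, $report{$name}{flag}, $char_len, $byte_len;
}

print "strings compare equal: ", ($decoded eq $plain ? "yes" : "no"), "\n";
```

If the two strings compare equal as characters but report different byte
lengths, that would be consistent with the hexdump in the original report,
where the first two characters come out as two bytes each and the last two as
one byte each.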

If there were some workaround here I would be glad to hear it, but I seem to
have exhausted my ability to troubleshoot this.
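
One thing that may be worth trying is to encode explicitly to the target
charset before printing, so that only a plain byte string ever reaches the
tied handle.  This is a sketch of the idea in isolation, assuming ISO-8859-1
output as in the test script; it has not been verified inside Apache::ASP.

```perl
use strict;
use warnings;
use Encode ();

# Character string, as produced by decode in the test script.
my $text = Encode::decode('ISO-8859-1', "\xE2");

# Convert to octets in the output charset before handing it to print.
my $octets = Encode::encode('ISO-8859-1', $text);

# In an Apache::ASP script the next step would be: print $octets;
# Here we only show the bytes that would be sent, as hex.
print unpack('H*', $octets), "\n";
```

Because the result of encode is a byte string rather than a character string,
its length is the same with or without "use bytes", which is why it might
avoid the conversion seen above.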

Regards,

Josh



> # Latin-1.rasp: #############
>
> <%
> #use open ( ":utf8", ":std" );
> #binmode ( STDOUT, ":encoding(ISO-8859-1)" );
>
> $::Response->{Charset} = "ISO-8859-1";
>
> use Encode;
>
> print Encode::decode('ISO-8859-1',"\xE2"),
> Encode::decode('UTF-8',Encode::encode('UTF-8',"\xE2")),
> "\x{00E2}",
> chr(0x00E2);
> %>
>
> #############################
>
>>asp-perl Latin-1.rasp
> Content-Type: text/html; charset=ISO-8859-1
> Content-Length: 6
> Cache-Control: private
>
> ââââ
>>asp-perl Latin-1.rasp | tail -1 | hexdump
> 0000000 a2c3 a2c3 e2e2
> 0000006
>
> For some reason, the first 2 test characters are UTF-8 encoded, and the last 2
> are ISO-8859-1 encoded.
> How can I get the same results as the CGI script above?
>
>
