perl-modperl mailing list archives

From Chris Jacobson <chris.jacob...@online-rewards.com>
Subject Re: UTF8 fun with SOAP::Lite and mod_perl 1.3.33
Date Fri, 16 Mar 2007 21:27:18 GMT
FWIW, if you tell the client to render the page as UTF-8, your 'broken'
mod_perl version works correctly.  The page body is UTF-8, but the
Content-Type header instructs the client to render it as ISO-8859-1,
which is what produces the gremlin characters.
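
The byte sequence Drew reports in the quoted message fits the same kind of
mismatch: UTF-8 bytes read as ISO-8859-1 characters and then encoded to
UTF-8 a second time.  A minimal sketch using the core Encode module; it only
reproduces the bytes and is not a trace of what SOAP::Lite or mod_perl does
internally, and the variable names are made up for illustration:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The three UTF-8 bytes of U+2603 SNOWMAN, held as a raw byte string.
my $utf8_bytes = "\xE2\x98\x83";

# Misread the bytes as ISO-8859-1: each byte becomes one character
# (U+00E2, U+0098, U+0083).
my $misread = decode('ISO-8859-1', $utf8_bytes);

# Encode those three characters as UTF-8 again (the double encoding).
my $double = encode('UTF-8', $misread);

# Hex dump of the result, for comparison with the tcpdump capture.
my $hex = join ' ', map { sprintf '%02x', ord } split //, $double;
print "$hex\n";    # prints "c3 a2 c2 98 c2 83"
```

If that is what is happening, the thing to look for is a place where the
string is treated as one character per byte before it is written out.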

Aaron Hawryluk wrote:
> This is suspiciously similar to the problem I had with double-byte
> characters coming up where single-byte characters were expected.  If you
> find the answer to this, could you let me know?  I still can't migrate to
> mod_perl due to the problem.  Mind you, I'm on Apache2/mp2, so they could
> be completely unrelated...
> 
> Here's a sample of what happens:
> 
> Here it is under my old CGI model (which is now far too CPU-intensive):
> http://www.calgarysun.com/cgi-bin/publish.cgi?p=171082&x=articles&s=showbiz
> 
> And here it is under mod_perl:
> http://www.calgarysun.com/perl-bin/publish.cgi?p=171082&x=articles&s=showbiz
> 
> Hey! Mod_perl guys! Can you say "reproducibility"?
> 
> 
> --Aaron Hawryluk
> Webmaster, The Calgary Sun
> http://www.calgarysun.com
> webmaster@calgarysun.com
> Ph: 403-250-4371
> 
> 
> -----Original Message-----
> From: Drew Wilson [mailto:amw@apple.com] 
> Sent: March-16-07 1:15 PM
> To: modperl mod_perl
> Subject: UTF8 fun with SOAP::Lite and mod_perl 1.3.33
> 
> I'm trying to track down a Unicode encoding problem using SOAP::Lite
> 0.67 with mod_perl 1.29 on Apache 1.3.33.
> 
> The problem I'm seeing is my UTF8 strings are transformed in the http  
> response.
> 
> The strings look correct inside the perl space (e.g. printing to  
> STDERR inside the perl handler) but the strings are converted in the  
> http packet returned (captured using tcpdump).
> 
> For example, if I want to send back a string containing the Unicode
> snowman U+2603 (UTF-8 bytes E2 98 83), I manually encode the string as:
>             my $snowman = '☃';
>             my %result = ( 'snowman' => SOAP::Data->type( string => $snowman ) );
> 
> and return it
>             return %result;
> 
> When watching with tcpdump, I expect to see this UTF8 byte sequence:
> 	 e2 98 83
> but instead see
> 	c3 a2 c2 98 c2 83
> 
> I suspect the UTF8 byte sequence is being treated as a UTF 16  
> sequence [00 e2 00 98 00 83], which is then converted to the UTF8  
> equivalent byte sequence [c3 a2 c2 98 c2 83].
> 
> But I cannot figure out WHERE this conversion is being done.
> 
> Is there any way to trace data being written to the response?
> 
> BTW - the $snowman string returns 1 for utf8::is_utf8 and utf8::valid.
> 
> Thanks for any suggestions,
> 
> Drew
> 
> 
> 

-- 
____________________________________________________________________
Chris Jacobson                         Phone: (513) 665-9070 x310
Online-Rewards                         Fax  : (214) 242-4448
403 Vine Street, Second Floor          http://www.online-rewards.com
Cincinnati, OH 45202

