perl-embperl mailing list archives

From Gerald Richter - ECOS <rich...@ecos.de>
Subject AW: Getting mad with UTF-8
Date Wed, 03 Jul 2013 15:47:43 GMT
Hi,

sorry for the late reply.

Perl's utf8 flag does NOT say whether your data is UTF-8 or not. It tells us something about the
internal representation of your data inside Perl. So UTF-8 data can have the utf8 flag set, but
it need not, and everything is still all right.
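
A minimal sketch of this point, using only core Perl. The variable names are made up for illustration and have nothing to do with Embperl itself. It shows the same two-character string stored once with the flag off and once with the flag on, and that Perl treats both as equal:

```perl
use strict;
use warnings;

# "\xe9" is the single character U+00E9 (e with acute accent).
my $native = "h\xe9";        # stored internally as native bytes, utf8 flag off

my $upgraded = $native;
utf8::upgrade($upgraded);    # same characters, internal storage is now UTF-8, flag on

my $flag_native   = utf8::is_utf8($native)   ? 1 : 0;
my $flag_upgraded = utf8::is_utf8($upgraded) ? 1 : 0;
my $same_string   = ($native eq $upgraded)   ? 1 : 0;
my $same_length   = (length($native) == length($upgraded)) ? 1 : 0;

print "flag on native copy:   $flag_native\n";
print "flag on upgraded copy: $flag_upgraded\n";
print "strings compare equal: $same_string\n";
print "lengths are equal:     $same_length\n";
```

The flag differs between the two copies, but the comparison and the length do not, so the flag alone cannot tell you whether a scalar holds decoded characters or undecoded UTF-8 octets.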

Unfortunately when I wrote the utf8 %fdat handling I was not fully aware of this fact.

It might help to access your %fdat data via

$data = Encode::decode_utf8 ($fdat{foo}) ;

decode_utf8 will convert the UTF-8 data (that Embperl delivers) to the correct internal representation.
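
As a rough sketch of why this helps, assuming the form value arrives as raw UTF-8 octets and the database value arrives as an already decoded character string: $from_form and $from_db below are hypothetical stand-ins for $fdat{content} and $ticketdata[0]->[0], not real Embperl or DBI output.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical stand-in for a form value: the UTF-8 octets of "hé", flag off.
my $from_form = "h\xc3\xa9";

# Hypothetical stand-in for a DB value: the two characters "h" and U+00E9, flag on.
my $from_db = "h\x{e9}";
utf8::upgrade($from_db);

# Without decoding, the two values do not compare equal (3 characters vs 2).
my $equal_before = ($from_form eq $from_db) ? 1 : 0;

# Mixing them in one string makes Perl treat each octet of $from_form as a
# Latin-1 character, so on output the form value looks double-encoded.
my $mixed     = $from_form . $from_db;
my $mixed_hex = join " ", map { sprintf "%02x", ord } split //, Encode::encode_utf8($mixed);

# Decoding first turns the octets into the same two characters as the DB value.
my $decoded     = Encode::decode_utf8($from_form);
my $equal_after = ($decoded eq $from_db) ? 1 : 0;

print "equal before decode: $equal_before\n";
print "mixed string octets: $mixed_hex\n";
print "equal after decode:  $equal_after\n";
```

The octets printed for the mixed string start with 68 c3 83 c2 a9, which matches the h\xc3\x83\xc2\xa9 seen in the content6 log line of the original report.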

I will fix this in a future release.

Hope this helps

Gerald

> -----Original Message-----
> From: Jean-Christophe Boggio [mailto:embperl@thefreecat.org]
> Sent: Wednesday, 12 June 2013 16:44
> To: embperl@perl.apache.org
> Subject: Getting mad with UTF-8
> 
> Hello,
> 
> Can someone help me understand what could cause this :
> 
> warn "\$content : ".(utf8::is_utf8($content) ? "utf8" : "not utf8");
> warn "\$ticketdata[0]->[0] : ".(utf8::is_utf8($ticketdata[0]->[0]) ? "utf8" : "not utf8");
> warn "content4=$content";
> if ($ticketdata[0]->[0] ne $content) {
> 	warn "content5=$content";
> 	#
> 	warn "content6=$content stored=".$ticketdata[0]->[0];
> 	warn "content7=$content";
> }
> 
> In apache2 error.log :
> 
> [Wed Jun 12 16:35:56 2013] [warn] [12504]ERR:  32:  Warning in Perl code:
> $content : not utf8 at /var/www/sites/recia/rtgi3/rtgilib.pm line 382,
> <GEN46> line 13.
> [Wed Jun 12 16:35:56 2013] [warn] [12504]ERR:  32:  Warning in Perl code:
> $ticketdata[0]->[0] : utf8 at /var/www/sites/recia/rtgi3/rtgilib.pm line 383,
> <GEN46> line 13.
> [Wed Jun 12 16:29:13 2013] [warn] [10974]ERR:  32:  Warning in Perl code:
> content4=h\xc3\xa9 at /var/www/sites/recia/rtgi3/rtgilib.pm line 381,
> <GEN47> line 13.
> [Wed Jun 12 16:29:13 2013] [warn] [10974]ERR:  32:  Warning in Perl code:
> content5=h\xc3\xa9 at /var/www/sites/recia/rtgi3/rtgilib.pm line 383,
> <GEN47> line 13.
> [Wed Jun 12 16:29:13 2013] [warn] [10974]ERR:  32:  Warning in Perl code:
> content6=h\xc3\x83\xc2\xa9 stored=h\xc3\xa9 at
> /var/www/sites/recia/rtgi3/rtgilib.pm line 385, <GEN47> line 13.
> [Wed Jun 12 16:29:13 2013] [warn] [10974]ERR:  32:  Warning in Perl code:
> content7=h\xc3\xa9 at /var/www/sites/recia/rtgi3/rtgilib.pm line 386,
> <GEN47> line 13.
> 
> As you see, the $content variable changes from one line to the other ?!?
> $ticketdata[0]->[0] contains "hé" coming from a DB (configured as UTF-8) and
> the test should not fail.
> 
> I guess the problem comes from the fact that on the same line I have one
> utf-8 variable and one non-utf8 one.
> 
> $content comes from $fdat{content} (not marked as utf8 while the page
> encoding is declared and recognized as utf-8).
> 
> What can I do to force embperl to always set the utf-8 flag on $fdat{...} ?
> 
> If you know a way of telling Apache/EmbPerl that no encoding other than
> UTF-8 exists in the world, I'll take it. And it's not a problem if I'm incompatible
> with anything.
> 
> Thanks for your help,
> 
> (using libembperl-perl 2.5.0~rc3-1 on Debian/wheezy with
> apache2-mpm-prefork 2.2.22-13)
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: embperl-unsubscribe@perl.apache.org
> For additional commands, e-mail: embperl-help@perl.apache.org




