perl-embperl mailing list archives

From Jean-Christophe Boggio <>
Subject Re: Problem with file upload
Date Tue, 30 Sep 2008 23:59:30 GMT
Ben Hiebert wrote:
> Perl usually tries to guess at the best encoding when it takes in the
> data and then encodes it internally as best it can. You may have a
> problem where the data comes in as ISO88591 but Perl thinks it is UTF8
> data, encodes it internally as UTF8 and then prints out the
> UTF8-as-ISO88591 to give you the bad results.

Yes, that is my guess too.

> It may be worth checking to see what format Perl thinks your incoming 
> data is by using
> $flag = utf8::is_utf8(STRING);

Good idea. I modified the code to this:

while (read($fdat{efilename}, $buffer, 32768)) {
	# Print a marker for every chunk that Perl has flagged as UTF-8
	if (utf8::is_utf8($buffer)) {
		print OUT "u";
	}
	print FILE $buffer;
}

...but in both cases (working and not) I never get the "uuuuu" lines.
But when $buffer is written to disk, it is transformed! I tried calling
binmode FILE just after opening the file for output, but the same
thing happens.
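
For comparison, here is a minimal sketch of a copy loop that keeps both handles free of encoding layers, so the bytes should reach the disk unchanged. It is only an illustration and may not be the cause of the problem described here: the temporary files and the names $in, $out, $src_name and $out_name are made up for the example, and $in merely stands in for the upload handle ($fdat{efilename} above).

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the uploaded file: a temp file holding
# ISO-8859-1 bytes (0xE9 is "e acute" in ISO-8859-1).
my ($src, $src_name) = tempfile(UNLINK => 1);
binmode($src);
print {$src} "caf\xE9\n";
close($src) or die "close $src_name: $!";

# Read with the :raw layer so no decoding is applied on input.
open(my $in, '<:raw', $src_name) or die "open $src_name: $!";

# Write through a handle in binary mode so no encoding is applied on output.
my ($out, $out_name) = tempfile(UNLINK => 1);
binmode($out);

my $buffer;
while (read($in, $buffer, 32768)) {
    print {$out} $buffer;
}
close($in);
close($out) or die "close $out_name: $!";

# Compare sizes: a byte-for-byte copy has the same length as its source.
printf "%d bytes in, %d bytes out\n", (-s $src_name), (-s $out_name);
```

If the two sizes differ, some layer on one of the handles is re-encoding the data on the way through.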

> If Perl thinks UTF8 then it is misinterpreting your incoming data and 
> you'll need to either decode it with decode or with one of the other 
> UTF8 utilities.  This may work:
> $GoodInternalString = decode("iso-8859-1", $IncomingData);

That's what I use when the file *is* iso-8859-1.
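
A small sketch of what that decode call does to the flag reported by utf8::is_utf8, assuming the Encode module and a made-up sample string ("caf\xE9"). Variable names are illustrative only. Bytes taken as they arrive normally carry no UTF-8 flag whatever encoding they are really in, which could explain why the "u" marker never shows up in the loop above.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample: four raw ISO-8859-1 bytes, as read from a raw handle.
my $incoming = "caf\xE9";

# Turn the bytes into a Perl character string.
my $internal = decode('iso-8859-1', $incoming);

# Re-encode the character string as UTF-8 bytes for output.
my $utf8_bytes = encode('UTF-8', $internal);

# Report the internal flag and the length of each string.
for my $pair ([ 'incoming', $incoming ],
              [ 'internal', $internal ],
              [ 'utf8_bytes', $utf8_bytes ]) {
    my ($name, $value) = @$pair;
    printf "%-10s flag=%s length=%d\n",
        $name, (utf8::is_utf8($value) ? 'on' : 'off'), length($value);
}
```

The flag only describes how Perl stores the string internally; it does not say which encoding the original bytes were in.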

> These are the pages I read over and over and over again until my pages 
> magically work:

:-) I see *exactly* what you mean. I've read these pages over and over too.

I don't get the reason for that random behaviour.


