perl-asp mailing list archives

From: Fernando Munoz
Subject: RE: UTF8 issue
Date: Wed, 29 Jan 2003 18:57:12 GMT

Thanks Philip, that solves the problem. I also managed to find a less
elegant, but equally effective, solution on my own. It operates on the string
and passes the result to a second scalar, which ends up as a string of bytes:

my ($description, $value) = split(":", $biblio[$n]);   # both are UTF8 at this point
$value = sprintf("%4.2f", $value);                     # $value goes back to a string of bytes
my $lstring = length($description);
my $newdesc = substr($description, 0, $lstring);       # $newdesc holds $description as a string of bytes

After this the digests are all different and correct. It is not elegant, but
it works.

Thanks again.

-----Original Message-----
From: Philip Mak
Sent: Wednesday, January 29, 2003 10:07 AM
To: Fernando Munoz
Subject: Re: UTF8 issue

I'm guessing you'll have to somehow "cast" the UTF8 strings so that
they're interpreted byte-by-byte, rather than character-by-character.

Maybe try calling utf8::encode($str) first (it converts $str in place and
returns nothing, and it does not need "use utf8;"), and then pass $str
to the MD5 function.
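
A minimal sketch of that "cast to bytes" idea, for reference. It assumes the
Encode module's encode_utf8() is available (it ships with Perl 5.8); the helper
name md5_hex_utf8 and the sample strings are made up for illustration and are
not from the original code:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Encode qw(encode_utf8);

# Hypothetical helper: hash the UTF-8 byte encoding of a character string.
# encode_utf8() returns a new byte string and leaves its argument untouched.
sub md5_hex_utf8 {
    my ($str) = @_;
    return md5_hex(encode_utf8($str));
}

# Two sample strings containing characters above 255 (illustrative only).
my $first  = "\x{263A} first title";
my $second = "\x{263B} second title";

print md5_hex_utf8($first),  "\n";
print md5_hex_utf8($second), "\n";
```

Because the digest is computed over the encoded bytes, two different character
strings give two different digests, and plain ASCII input hashes exactly as it
would with md5_hex() alone.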

On Wed, Jan 29, 2003 at 09:50:13AM -0800, Fernando Munoz wrote:
> Well, there's no error logging that I can refer to, but when you try
> to compute a hex digest of these strings (the ones coming in as UTF8),
> no matter how different the strings are, they always return the same
> digest. Searching around, I found this note:
> "Perl 5.8 support Unicode characters in strings. Since the MD5
> algorithm is only defined for strings of bytes, it can not be used
> on strings that contains chars with ordinal number above 255. The
> MD5 functions and methods will croak if you try to feed them such
> input data:"   
> in the documentation for Digest::MD5.
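
The behaviour described in that note can be checked with a short script. This
is only a sketch, assuming a Digest::MD5 release that croaks on wide characters
as the quoted documentation says; the sample string is made up:

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# A string holding one character with an ordinal number above 255.
my $wide = "\x{263A}";

# eval traps the exception so the script can report it instead of dying.
my $digest = eval { md5_hex($wide) };
my $error  = $@;

if (defined $digest) {
    print "digest: $digest\n";
} else {
    print "md5_hex refused the input: $error";
}
```

If the call croaks, $digest stays undefined and $error holds the message, which
is a quick way to confirm that a string still contains wide characters before
it is hashed.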