xerces-p-dev mailing list archives

From thomas Armstrong <tarmstr...@gmail.com>
Subject XML with Spanish characters
Date Sun, 21 Nov 2004 23:08:01 GMT
Hello.

I'm using Xerces-P to parse an XML file containing Spanish
characters, and I would like to store the data in a MySQL database.

But I'm running into a problem: when I browse the database,
I see 'aÃ±o' instead of 'año'.

To convert the XML data from UTF-8 to Latin-1, I tried these four
functions (suggested in the Perl-XML FAQ):

----------------------//--------------------------
# Attempt 1, with: use Unicode::String;
sub convert
{
 my $string = Unicode::String::utf8($_[0])->latin1();
 return $string;
}

# Attempt 2, with: use utf8;
sub convert
{
 my $string = $_[0];
 # Repack each code point as one byte (only meaningful below 256)
 $string = pack("C*", unpack('U*', $string));
 return $string;
}

# Attempt 3, with: use Text::Iconv;
sub convert
{
 my $string = $_[0];
 my $converter = Text::Iconv->new('UTF-8', 'ISO8859-1');
 $string = $converter->convert($string);
 return $string;
}

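# Attempt 4: map Windows-1252 punctuation to ASCII, drop other 0x80-0x9F bytes
sub convert
{
   my $string = $_[0];
   $string =~ tr/\x91\x92\x93\x94\x96\x97/''""\-\-/;  # curly quotes, dashes
   $string =~ s/\x85/.../sg;                           # ellipsis
   $string =~ tr/\x80-\x9F//d;  # no brackets: tr takes a plain character list
   return($string);
}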
----------------------//--------------------------

In every case, I parse the XML file, call 'convert($data)', and
store the result in a MySQL database.

I have no problem displaying the converted data (both in the shell
and on a web page), but when I browse the data stored in
the MySQL database, I see the wrong characters ('aÃ±o' instead of 'año').

I also checked that skipping 'convert($data)' altogether gives
the same result as calling it: correct output in the shell, wrong
characters in the database.
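
To make the symptom concrete, here is a minimal sketch of what the
bytes for 'año' look like in each encoding, assuming the core Encode
module (Perl 5.8 or later) is available. The variable names and the
hex_dump helper are made up for illustration. It only shows the kind
of mismatch that produces 'aÃ±o'; it does not show where in the
XML-to-MySQL chain the mismatch actually happens.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: show each character of a string as two hex digits.
sub hex_dump {
    return join ' ', map { sprintf '%02X', ord($_) } split //, $_[0];
}

# "año" as a character string; \x{F1} is U+00F1 (n with tilde).
my $chars = "a\x{F1}o";

# The bytes a UTF-8 encoded XML file holds for this word.
my $utf8_bytes = encode('UTF-8', $chars);

# The bytes a Latin-1 consumer expects for the same word.
my $latin1_bytes = encode('ISO-8859-1', $chars);

# What a Latin-1 consumer sees if it is handed the UTF-8 bytes instead:
# every byte becomes its own character.
my $misread = decode('ISO-8859-1', $utf8_bytes);

print "UTF-8:   ", hex_dump($utf8_bytes),   "\n";   # 61 C3 B1 6F
print "Latin-1: ", hex_dump($latin1_bytes), "\n";   # 61 F1 6F
print "Misread: ", hex_dump($misread),      "\n";   # 61 C3 B1 6F
print "Misread length: ", length($misread), "\n";   # 4 characters, not 3
```

The misread string has four characters (a, U+00C3, U+00B1, o), which
is exactly how 'aÃ±o' is spelled.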

I do not know whether this is a MySQL problem or a Perl/XML problem.

Any suggestion? Thank you very much.

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-p-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-p-dev-help@xml.apache.org

