xml-general mailing list archives

From jenny.bou...@nl.ibm.com
Subject RE: UTF-8/Latin-1 problem
Date Thu, 07 Jun 2001 11:13:28 GMT


I have the same problem with IE5. In my application I set the locale on AIX
to UTF-8 and produce an XML file containing some extended characters.
Xerces has no problem with the file, but IE5 doesn't recognise the extended
characters even though I've given UTF-8 in the XML declaration.
                                            
 An Invalid character was found in text content. Line 54, Position 23
                                            




<ATBEZ LAISO="FR">Syst? 2628-41G</ATBEZ>

The line is <ATBEZ LAISO="FR">Système 2628-41G</ATBEZ>
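
A quick way to narrow this down is to check whether the bytes on disk really are UTF-8, since a parser can only trust the declaration if the bytes match it. The following is a minimal sketch, not a diagnosis of this particular file: the class name `Utf8Check` and the method `isValidUtf8` are made up for illustration, and the sample line is simply the one quoted above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Xerces or any library.
class Utf8Check {
    /** Returns true if the bytes decode as strict UTF-8 (malformed input is reported, not replaced). */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // \u00e8 is the "è" in "Système"; the escape avoids source-file encoding issues.
        String line = "<ATBEZ LAISO=\"FR\">Syst\u00e8me 2628-41G</ATBEZ>";
        // Same text, two different byte encodings.
        System.out.println(isValidUtf8(line.getBytes(StandardCharsets.UTF_8)));      // true
        System.out.println(isValidUtf8(line.getBytes(StandardCharsets.ISO_8859_1))); // false
    }
}
```

If a check like this fails on the generated file, the writer is producing something other than UTF-8 regardless of the locale setting. If it passes, the file is at least validly encoded and the viewer becomes the more likely suspect.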



Jenny



Britta Schüle <britta.schuele@ixos.de> on 06/06/2001 04:21:25 PM

Please respond to general@xml.apache.org

To:   "'general@xml.apache.org'" <general@xml.apache.org>
cc:   "'David_N_Bertoni@lotus.com'" <David_N_Bertoni@lotus.com>
Subject:  RE: UTF-8/Latin-1 problem




Well, it seems the problem is solved and it wasn't a problem with the
parser, but a mixture of configuration problems and a possible bug in
Internet Explorer 5.0 (well, so who's surprised? ;)). So, big sorry for
causing a "panic" if I did!!!!
Cheers, Britta

-----Original Message-----
From: David_N_Bertoni@lotus.com [mailto:David_N_Bertoni@lotus.com]
Sent: Wednesday, June 06, 2001 4:08 PM
To: general@xml.apache.org
Subject: Re: UTF-8/Latin-1 problem



Are you saying that your document has an XML decl with the correct encoding
and the parser is not honoring the encoding?  That sounds like a huge bug,
and I can't believe it would not have been caught in testing.  If so, you
should file a bug in bugzilla and attach a sample document that reproduces
the problem.

If, on the other hand, you're saying your document is encoded in iso-8859-1,
but there is no encoding in the XML decl, then you have a document that
is not well-formed.  There is no way for the parser to "autodetect"
iso-8859-1.  Indeed, it is required to assume utf-8 in the absence of an
explicit encoding.
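
The difference between the two cases can be reproduced with a short sketch. It uses the standard JAXP `DocumentBuilderFactory` API rather than the Xerces-J classes directly, and the class name `DeclaredEncodingDemo`, the helper `rootText`, and the sample element are assumptions made for illustration only.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

// Hypothetical demo class; only the JAXP calls are real API.
class DeclaredEncodingDemo {
    /** Parses raw bytes and returns the root element's text, or null if the parser rejects the input. */
    static String rootText(byte[] xmlBytes) throws ParserConfigurationException {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes))
                    .getDocumentElement()
                    .getTextContent();
        } catch (SAXException | IOException e) {
            // Malformed byte sequences surface as one of these two exception types.
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        String body = "<ATBEZ LAISO=\"FR\">Syst\u00e8me 2628-41G</ATBEZ>";

        // Latin-1 bytes WITH a matching encoding declaration.
        byte[] declared = ("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>" + body)
                .getBytes(StandardCharsets.ISO_8859_1);
        // The same Latin-1 bytes WITHOUT any declaration: the parser must assume UTF-8.
        byte[] undeclared = body.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(rootText(declared));   // text of the root element
        System.out.println(rootText(undeclared)); // null: the lone 0xE8 byte is not valid UTF-8
    }
}
```

With the JDK's bundled parser the second call is expected to fail, because the byte for "è" in iso-8859-1 is not a legal UTF-8 sequence on its own. The parser may also print a fatal-error line to standard error before the exception is caught.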

By the way, this question is not appropriate for the general list.  You
should subscribe to the Xerces-J list and post your parser-related questions
there.

Dave





Britta Schüle <britta.schuele@ixos.de> on 06/06/2001 04:10 AM

Please respond to general

To:   general@xml.apache.org
cc:   (bcc: David N Bertoni/CAM/Lotus)
Subject:  UTF-8/Latin-1 problem








Hi,
I'm working on a project where XML documents might have all sorts of
encodings. The parser deals with the UTF-8 stuff just fine, but when it gets
a Latin-1 (iso-8859-1) document, it produces useless characters unless I set
the encoding explicitly.
Now I can't quite believe that the parser won't read the encoding from the
XML, so my question is, am I missing something? Is there a way to get the
parser to sort of "autodetect" an XML file's encoding?
I'm currently testing on the SAX2SAX sample in the Xalan-Java 2 download.
Thanks loads in advance,
Britta
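
For readers wondering what "setting the encoding explicitly" can look like, here is one possible sketch using standard SAX/JAXP classes. It hands the parser a character stream that has already been decoded with a known charset, so the parser does not have to work out the byte encoding itself. The class name `ExplicitEncodingDemo` and the helper `rootText` are hypothetical, and this is not necessarily how the SAX2SAX sample mentioned above does it.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class; only the JAXP/SAX calls are real API.
class ExplicitEncodingDemo {
    /** Decodes the bytes with the given charset first, then parses the resulting characters. */
    static String rootText(byte[] xmlBytes, Charset charset) throws Exception {
        Reader reader = new InputStreamReader(new ByteArrayInputStream(xmlBytes), charset);
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(reader))
                .getDocumentElement()
                .getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Latin-1 bytes with no encoding declaration at all.
        byte[] undeclared = "<doc>Syst\u00e8me</doc>".getBytes(StandardCharsets.ISO_8859_1);
        // The caller supplies the encoding, so the missing declaration does not matter.
        System.out.println(rootText(undeclared, StandardCharsets.ISO_8859_1));
    }
}
```

The trade-off is that the caller now has to know the encoding out of band. A document that carries a correct encoding declaration does not need this, because the parser reads the declaration itself.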

---------------------------------------------------------------------
In case of troubles, e-mail:     webmaster@xml.apache.org
To unsubscribe, e-mail:          general-unsubscribe@xml.apache.org
For additional commands, e-mail: general-help@xml.apache.org







