commons-user mailing list archives

From <Jan.Van-Sta...@ec.europa.eu>
Subject RE: [Urgent] UTF-8 encoding problem
Date Thu, 04 Jan 2007 12:42:33 GMT
This sounds more like an encoding problem to me.

Try submitting an XML file that contains only "simple" characters (code points below 128).
These are encoded the same way in UTF-8 as in other encodings such as windows-1252. If that
works, I would suspect that the failing XML file is not actually UTF-8 encoded: its XML header
declares UTF-8, but characters like é, è, or the euro sign are encoded differently in UTF-8
than in those other encodings.
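
A minimal Java sketch of that check, independent of Commons Configuration: it encodes the
same text as windows-1252 and as UTF-8, then attempts a strict UTF-8 decode of the
windows-1252 bytes. The class name EncodingDemo, the helper isValidUtf8 and the sample
strings are made up for illustration, and windows-1252 is only an assumed example of a
non-UTF-8 encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class EncodingDemo {
    /** Returns true if the bytes decode as UTF-8 with no malformed sequences. */
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252"); // assumed legacy encoding
        String ascii = "<a>plain</a>";                    // only characters below 128
        String accented = "<a>\u00e9t\u00e9</a>";         // contains é (U+00E9)

        // ASCII-only text: same bytes in both encodings.
        System.out.println(Arrays.equals(
            ascii.getBytes(cp1252), ascii.getBytes(StandardCharsets.UTF_8)));
        // Accented text: é is 0xE9 in windows-1252 but 0xC3 0xA9 in UTF-8.
        System.out.println(Arrays.equals(
            accented.getBytes(cp1252), accented.getBytes(StandardCharsets.UTF_8)));
        // Strict UTF-8 decoding of the windows-1252 bytes.
        System.out.println(isValidUtf8(ascii.getBytes(cp1252)));
        System.out.println(isValidUtf8(accented.getBytes(cp1252)));
    }
}
```

Run as is, this should print true, false, true, false. In windows-1252, é is the single byte
0xE9, which UTF-8 treats as the first byte of a 3-byte sequence. When the next byte is an
ordinary character, a UTF-8 decoder rejects byte 2 of that sequence, which would be
consistent with the error quoted below.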

Jan

------------------
Jan Van Stalle
DIGIT.B.03                 
tel +32 2 299 49 82
Bureau MO34 2/54


-----Original Message-----
From: Mark Diggory [mailto:mdiggory@gmail.com] 
Sent: Thursday, January 04, 2007 1:08 PM
To: Jakarta Commons Users List
Subject: Re: [Urgent] UTF-8 encoding problem


This looks more like an XML / Xerces parsing issue; I would seek help there.
It sounds like you're placing non-UTF-8 encoded characters into your XML file.
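
One way to find the offending characters is to look for the first byte that a strict UTF-8
decoder refuses. The sketch below uses only the JDK's java.nio.charset API; the class name
Utf8Locator, the method firstInvalidOffset and the file path passed on the command line are
hypothetical, and it assumes the file is small enough to read into memory.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

final class Utf8Locator {
    /**
     * Returns the byte offset where the first malformed UTF-8 sequence starts,
     * or -1 if the whole array decodes cleanly.
     */
    public static int firstInvalidOffset(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(bytes);
        // UTF-8 never decodes to more chars than it has bytes.
        CharBuffer out = CharBuffer.allocate(bytes.length + 1);
        CoderResult result = decoder.decode(in, out, true);
        // On error, the input position is left at the start of the bad sequence.
        return result.isError() ? in.position() : -1;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the XML file to check (hypothetical path supplied by the caller).
        byte[] bytes = Files.readAllBytes(Paths.get(args[0]));
        int offset = firstInvalidOffset(bytes);
        if (offset < 0) {
            System.out.println("File decodes as UTF-8");
        } else {
            System.out.printf("First invalid byte 0x%02X at offset %d%n",
                bytes[offset] & 0xFF, offset);
        }
    }
}
```

Opening the file at the reported offset in a hex viewer, or re-saving it as UTF-8 from an
editor, would then show which characters were written in another encoding.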

-Mark

On 12/28/06, DECAFFMEYER MATHIEU <MATHIEU.DECAFFMAYER@fortis.lu> wrote:
>
>
> Hi,
>
> I am using Jakarta Commons Configuration to manipulate some XML files.
> I get the following error when I open one of the files:
>
> org.apache.commons.configuration.ConfigurationException: Octet 2 incorrect dans la séquence UTF-8 à 3-octets.
>         at org.apache.commons.configuration.XMLConfiguration.load(XMLConfiguration.java:620)
>         at org.apache.commons.configuration.XMLConfiguration.load(XMLConfiguration.java:578)
>         at org.apache.commons.configuration.XMLConfiguration$XMLFileConfigurationDelegate.load(XMLConfiguration.java:1045)
>         at org.apache.commons.configuration.AbstractFileConfiguration.load(AbstractFileConfiguration.java:280)
> [...]
>
>
>
> Caused by: java.io.UTFDataFormatException: Octet 2 incorrect dans la séquence UTF-8 à 3-octets.
>         at org.apache.xerces.impl.io.UTF8Reader.invalidByte(Unknown Source)
>         at org.apache.xerces.impl.io.UTF8Reader.read(Unknown Source)
>         at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
> [...]
>
>
> The first lines of the file are:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE configuration [
> <!ENTITY amp "&#x26;">
> <!ENTITY lt "&#x3C;">
> <!ENTITY minus "&#45;">
> ]>
> [...]
>
> I have another XML file with exactly the same lines at the top,
> and I have no problem with that one:
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE configuration [
> <!ENTITY amp "&#x26;">
> <!ENTITY lt "&#x3C;">
> <!ENTITY minus "&#45;">
> ]>
> [...]
>
> What do you suggest I do?
>
> Thank you for any help! It will be greatly appreciated!
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: commons-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: commons-user-help@jakarta.apache.org
>
>


