axis-java-user mailing list archives

From "Michael Serero" <michael.ser...@covad.net>
Subject RE: Character encoding
Date Tue, 27 Jan 2004 21:20:17 GMT
Nelson,

Thanks for your reply. Do you have any suggestions on how to convert CP1252
to UTF-8?

I have tried something along the following lines:

	String myString = "The CP1252 string";
	Charset cs = Charset.forName("UTF-8");
	context.writeSafeString(new String(cs.encode(myString).array()));

But it did not work for me.
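
Two things in the snippet above are worth noting: `ByteBuffer.array()` returns
the whole backing array, which can be longer than the encoded content, and
`new String(byte[])` decodes with the platform default charset. A Java String
also has no encoding of its own, so a conversion probably only makes sense at
the byte level: decode the raw bytes as windows-1252, then encode the result
as UTF-8. The following is a minimal sketch under that assumption. The class
and method names are made up for illustration, and it assumes the JVM
provides the "windows-1252" charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Cp1252ToUtf8 {
    // "windows-1252" is the Java charset name for CP1252 (assumed available).
    private static final Charset CP1252 = Charset.forName("windows-1252");

    /** Re-encodes CP1252 bytes as UTF-8 bytes (hypothetical helper). */
    public static byte[] convert(byte[] cp1252Bytes) {
        // Decode with the charset the bytes were really written in...
        String text = new String(cp1252Bytes, CP1252);
        // ...then encode the resulting characters as UTF-8.
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0x93 and 0x94 are the CP1252 left and right smart quotes.
        byte[] input = { (byte) 0x93, 'h', 'i', (byte) 0x94 };
        StringBuilder hex = new StringBuilder();
        for (byte b : convert(input)) {
            hex.append(String.format("%02X ", b));
        }
        System.out.println(hex.toString().trim()); // E2 80 9C 68 69 E2 80 9D
    }
}
```

If the text is already a String holding the right characters, no conversion
is needed; only the encoding used when the bytes are written out matters.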

Also, I am puzzled by the XMLUtils.encodeString() method.
If the string argument contains one of the characters &, ", \, ', < or >,
all those characters, plus any characters coded on more than one byte, also
get escaped.

The substitution does not take place if the "magic" characters are not in
the string (?). In other words, the non-US-ASCII characters get encoded
differently based on whether other characters in the string need encoding.
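
For comparison, here is a minimal sketch of an escaper whose output for a
given character does not depend on the rest of the string. It is not the
Axis implementation; `XmlEscapeSketch` and `escapeXml` are hypothetical names,
and it assumes non-ASCII characters can be passed through unchanged because
the document is later written out as UTF-8.

```java
public class XmlEscapeSketch {
    /** Escapes the five XML special characters; everything else is copied as-is. */
    public static String escapeXml(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);        // non-ASCII passes through untouched
            }
        }
        return out.toString();
    }
}
```

With this approach a smart quote comes out the same way whether or not the
string also contains a character such as < or &.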

Michael


-----Original Message-----
From: Nelson Minar [mailto:nelson@monkey.org]
Sent: Tuesday, January 20, 2004 4:16 PM
To: axis-user@ws.apache.org
Subject: Re: Character encoding


>When I send a SOAP request to my server, if one of the strings contains a
>smart quote (“), the server generates the following parsing error:
>org.xml.sax.SAXParseException: Character conversion error: "Unconvertible
>UTF-8 character beginning with 0x93" (line number may be too low).

I'm not positive, but I suspect the problem is that 0x93 is the CP1252
byte for a left smart quote, not a valid way to encode one in UTF-8. In
UTF-8, 0x93 can only appear as a continuation byte, so a character
beginning with 0x93 is not valid UTF-8, which I think is what that
error is telling you.

Whatever software is generating that request is probably taking
Windows CP1252 and pretending it's UTF-8. You'll need to fix that.
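
One way to check this is with a strict decoder. The sketch below (class and
method names are made up for illustration, and it assumes the JVM provides
the "windows-1252" charset) shows that the lone byte 0x93 is rejected as
UTF-8, decodes to U+201C as CP1252, and that the real UTF-8 form of that
character is three bytes long.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class SmartQuoteBytes {
    /** Returns true if the bytes are well-formed UTF-8 (hypothetical helper). */
    public static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] cp1252Quote = { (byte) 0x93 };
        byte[] utf8Quote = "\u201C".getBytes(StandardCharsets.UTF_8); // E2 80 9C

        System.out.println(isValidUtf8(cp1252Quote)); // false
        System.out.println(isValidUtf8(utf8Quote));   // true

        // The same single byte is a left smart quote when read as CP1252.
        String decoded = new String(cp1252Quote, Charset.forName("windows-1252"));
        System.out.println(decoded.equals("\u201C")); // true
    }
}
```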

