axis-java-user mailing list archives

From "Mike Burati" <>
Subject UTF-8 char issue with 1.2RC2 ?
Date Wed, 02 Feb 2005 17:05:39 GMT
I was looking into an issue a coworker ran into calling the XMethods Babelfish translation
service (specifically to test out non-ASCII chars), and decided to try a simple WSDL2Java
based client against the service to see how it compared to how he was using AXIS (1.2RC2).
I was able to reproduce nearly the same problem.
I figured I'd ask here before digging too deep on this one, hoping that someone can spot
what I'm doing wrong, what the service itself is doing wrong, or what may be a bug in
1.2RC2. I'm sure I've seen non-ASCII chars work just fine in AXIS response envelopes
before, so I'm not sure what's up with this one...
Service is at:
Sample request envelope
<soapenv:Envelope xmlns:soapenv="" xmlns:xsd=""
soapenv:encodingStyle="" xmlns:ns1="urn:xmethodsBabelFish"><translationmode
xsi:type="xsd:string">en_fr</translationmode><sourcedata xsi:type="xsd:string">I'm
going to the beach.</sourcedata></ns1:BabelFish></soapenv:Body></soapenv:Envelope>
Sample response envelope:
<?xml version="1.0" encoding="UTF-8"?><SOAP-ENV:Envelope xmlns:SOAP-ENC=""
SOAP-ENV:encodingStyle="" xmlns:xsi=""
xmlns:SOAP-ENV="" xmlns:xsd=""><SOAP-ENV:Body><namesp1:BabelFishResponse
xmlns:namesp1="urn:xmethodsBabelFish"><return xsi:type="xsd:string">je vais à la
plage </return></namesp1:BabelFishResponse></SOAP-ENV:Body></SOAP-ENV:Envelope>

With an HTTP content type header of:
Content-Type: text/xml; charset=utf-8

The AXIS Fault I'm getting back in the simple WSDL2Java generated client (unit test hand-modified
to just run as a standalone client without JUnit) is slightly different from what my coworker
is getting, but both are the same below Message.getSOAPEnvelope in the stack, where it's trying
to parse the XML for the SOAP envelope in both cases, via the DeserializationContext for the
response envelope.
The encoding for the response envelope is marked UTF-8, as are the HTTP response headers associated
with that envelope.  The envelope looks fairly valid by eye, and a simple Java program that
just posted the above request to the same service and read and dumped out the response using
a UTF-8 encoding didn't complain about any of the bytes in the response.
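
One caveat on that last check, offered as a sketch rather than a diagnosis: in Java, reading bytes through `new String(bytes, "UTF-8")` or an `InputStreamReader` does not throw on malformed input; it silently substitutes U+FFFD. A strict check needs a `CharsetDecoder` configured to report errors. The class and method names below are hypothetical, and the ISO-8859-1 sample is only an assumed illustration of what a mislabeled body could look like (0xE0 is `à` in ISO-8859-1 and is also the lead byte of a 3-byte UTF-8 sequence); nothing here confirms that this is what the service actually sends.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of AXIS or of the original test client.
public class StrictUtf8Check {

    /** Returns true only if the whole byte array is well-formed UTF-8. */
    public static boolean isWellFormedUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Returns true if a lenient decode had to substitute U+FFFD somewhere. */
    public static boolean lenientDecodeHidesErrors(byte[] bytes) {
        String lenient = new String(bytes, StandardCharsets.UTF_8);
        return lenient.indexOf('\uFFFD') >= 0;
    }

    public static void main(String[] args) {
        // Assumed sample text taken from the response envelope above.
        String text = "je vais \u00e0 la plage";
        byte[] asUtf8 = text.getBytes(StandardCharsets.UTF_8);
        byte[] asLatin1 = text.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 bytes well-formed:   " + isWellFormedUtf8(asUtf8));
        System.out.println("Latin-1 bytes well-formed: " + isWellFormedUtf8(asLatin1));
        System.out.println("Lenient decode hid errors: " + lenientDecodeHidesErrors(asLatin1));
    }
}
```

Running the strict check over the raw response bytes (captured before any Reader touches them) would show whether the parser complaints reflect the bytes on the wire or something that happens later inside the client.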

Note:  I don't get the error if the response contains just ASCII chars.
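
Since only non-ASCII responses fail, one way to narrow this down is to list every byte at or above 0x80 in the raw response body together with its offset. The following is a minimal sketch with a hypothetical class name; it assumes the body has already been captured as a byte array (for example from a packet trace or a raw socket read) and makes no claim about what the Babelfish service really returns.

```java
import java.util.Locale;

// Hypothetical debugging helper, not part of AXIS.
public class NonAsciiByteDump {

    /** Lists each non-ASCII byte as "offset N: 0xHH", one per line. */
    public static String describeNonAscii(byte[] body) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < body.length; i++) {
            int b = body[i] & 0xFF; // treat the byte as unsigned
            if (b >= 0x80) {
                out.append(String.format(Locale.ROOT, "offset %d: 0x%02X\n", i, b));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // In well-formed UTF-8 the character U+00E0 is the two bytes C3 A0;
        // a lone E0 would instead be the ISO-8859-1 encoding of the same character.
        byte[] sample = { 'l', 'a', (byte) 0xC3, (byte) 0xA0, ' ' };
        System.out.print(describeNonAscii(sample));
    }
}
```

Comparing the dumped bytes against the declared charset shows quickly whether the `charset=utf-8` header and the XML declaration match the payload.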
Here's one exception I'm getting in the standalone test (MustUnderstand checker asks for the
SOAP env):
	{}stackTrace:org.xml.sax.SAXParseException: Character conversion
error: "Malformed UTF-8 char -- is an XML encoding declaration missing?"
(line number may be too low).
	at org.apache.crimson.parser.InputEntity.fatal(
	at org.apache.crimson.parser.InputEntity.fillbuf(
	at org.apache.crimson.parser.InputEntity.isXmlDeclOrTextDeclPrefix(
	at org.apache.crimson.parser.Parser2.maybeXmlDecl(
	at org.apache.crimson.parser.Parser2.parseInternal(
	at org.apache.crimson.parser.Parser2.parse(
	at org.apache.crimson.parser.XMLReaderImpl.parse(
	at javax.xml.parsers.SAXParser.parse(
	at org.apache.axis.encoding.DeserializationContext.parse(
	at org.apache.axis.SOAPPart.getAsSOAPEnvelope(
	at org.apache.axis.Message.getSOAPEnvelope(

And here's another, almost identical one from another batch of client code using AXIS
directly (without WSDL2Java):

 faultCode: {}Server.userException
 faultString: Invalid byte 2 of 3-byte UTF-8 seq
Invalid byte 2 of 3-byte UTF-8 sequence.
        at Source)
        at Source)
        at org.apache.xerces.impl.XMLEntityScanner.load(Unknown Source)
        at org.apache.xerces.impl.XMLEntityScanner.scanContent(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanContent(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at javax.xml.parsers.SAXParser.parse(Unknown Source)
        at org.apache.axis.encoding.DeserializationContext.parse(Deserialization
        at org.apache.axis.SOAPPart.getAsSOAPEnvelope(
        at org.apache.axis.Message.getSOAPEnvelope(
        at org.apache.axis.handlers.soap.MustUnderstandChecker.invoke(MustUnders
        at org.apache.axis.client.AxisClient.invoke(
        at org.apache.axis.client.Call.invokeEngine(
        at org.apache.axis.client.Call.invoke(
        at org.apache.axis.client.Call.invoke(
        at org.apache.axis.client.Call.invoke(
        at org.apache.axis.client.Call.invoke(
