axis-java-dev mailing list archives

From "Barzilai Spinak (JIRA)" <>
Subject [jira] Created: (AXIS-2218) Comment nodes in a doc/lit message are deserialized as text nodes
Date Thu, 15 Sep 2005 02:02:54 GMT
Comment nodes in a doc/lit message are deserialized as text nodes

         Key: AXIS-2218
     Project: Apache Axis
        Type: Bug
  Components: Serialization/Deserialization  
    Versions: 1.2.1    
 Environment: Apache Axis 1.2.1
It didn't happen with Axis 1.1 (which had other problems of its own and have been solved in
    Reporter: Barzilai Spinak

When the Axis 1.2.1 server receives a doc/lit SOAP message whose XML data contains comment
nodes, they are deserialized as Text nodes.
This means, for example, that if the comment is commenting out a piece of XML, the characters
<, >, & and " inside it come back escaped as &lt;, &gt;, &amp; and &quot;.
Also, non-ASCII characters (code points above 127) come back as numeric character references
in the &#xHH; format.
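
For comparison, here is a minimal sketch of what a standard JAXP DOM parser does with the same kind of input outside of Axis. The class and method names (CommentNodeCheck, firstChildOfRoot) are made up for illustration; only the javax.xml.parsers and org.w3c.dom calls are real APIs. It is not a reproduction of the Axis code path, just a way to see the behavior one would expect: the comment arrives as a Comment node and its data is left unescaped.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Axis.
final class CommentNodeCheck {

    /** Parses the XML string and returns the first child of the root element. */
    static Node firstChildOfRoot(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getFirstChild();
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Node n = firstChildOfRoot(
                "<doc><!-- Abbott & Costello said \"Hello\"--><a/></doc>");
        // A plain DOM parser reports a Comment node here, not a Text node.
        System.out.println(n.getNodeType() == Node.COMMENT_NODE);
        // The comment data keeps & and " exactly as written.
        System.out.println(n.getNodeValue());
    }
}
```

Running the same check against the nodes Axis hands to the service would be one way to confirm the node type reported in this issue.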

Imagine a Web Service with one operation that receives an XML document and saves it to a file,
and another operation that later returns the saved document.
The returned document will have garbled comments! If I were to resend and save the returned
document again, the errors would compound on every round trip.
For example, this original comment
<!-- Abbott & Costello said "Hello"--> will become
<!-- Abbott &amp; Costello said &quot;Hello&quot;--> and then
<!-- Abbott &amp;amp; Costello said &amp;quot;Hello&amp;quot;--> and then
<!-- Abbott &amp;amp;amp; Costello said &amp;amp;quot;Hello&amp;amp;quot;-->
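
The progression above can be imitated with a small sketch that applies text-style escaping to the comment body once per round trip. This only models the symptom. The names EscapeDrift, escape and roundTrips are hypothetical, and the sketch does not claim to be what Axis does internally.

```java
// Hypothetical model of the symptom, not Axis code.
final class EscapeDrift {

    /** Escapes &, <, > and " the way text content is commonly escaped. */
    static String escape(String s) {
        // & must be handled first so the entities added below are not re-escaped
        // within the same pass.
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Applies escape() n times, one per simulated save-and-return round trip. */
    static String roundTrips(String commentBody, int n) {
        String out = commentBody;
        for (int i = 0; i < n; i++) {
            out = escape(out);
        }
        return out;
    }

    public static void main(String[] args) {
        String body = " Abbott & Costello said \"Hello\"";
        for (int i = 0; i <= 3; i++) {
            System.out.println("<!--" + roundTrips(body, i) + "-->");
        }
    }
}
```

Each pass re-escapes the & of the entities produced by the previous pass, which is why the damage grows instead of settling.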

None of this is hypothetical: it is actually happening to me in my project :-)
I had to write a method in the service that cleans up comments before they are processed by the
rest of the system.
A cool and ugly hack :-)
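
The reporter's actual cleanup method is not shown in this message. As a rough sketch of what such a workaround might look like, the hypothetical CommentCleaner below undoes one level of escaping inside each comment of a serialized document. It assumes the <!-- and --> delimiters survive intact, as in the example above, and it leaves text outside comments untouched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical workaround sketch; the real method from the report is unknown.
final class CommentCleaner {

    // Non-greedy match of a whole comment, including multi-line ones.
    private static final Pattern COMMENT =
            Pattern.compile("<!--(.*?)-->", Pattern.DOTALL);

    /** Undoes exactly one level of entity escaping. */
    static String unescapeOnce(String s) {
        // &amp; goes last so that "&amp;lt;" becomes "&lt;" rather than "<".
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&");
    }

    /** Rewrites every comment body in the XML string with unescapeOnce(). */
    static String cleanComments(String xml) {
        Matcher m = COMMENT.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String fixed = "<!--" + unescapeOnce(m.group(1)) + "-->";
            // quoteReplacement keeps $ and \ in the comment from being
            // interpreted as replacement syntax.
            m.appendReplacement(out, Matcher.quoteReplacement(fixed));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(cleanComments(
                "<a>x &amp; y<!-- Abbott &amp; Costello said &quot;Hello&quot;--></a>"));
    }
}
```

A regex over the serialized form is fragile (CDATA sections containing "<!--" would confuse it), so this is only a stopgap until the deserializer itself produces Comment nodes.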

As for the &#xHH; handling of non-ASCII characters (which my fix also cleans up): while it is
not technically wrong for Text and Attr nodes, it would be nice if they were deserialized as
the actual characters. Java handles Unicode, UTF-8 and UTF-16 perfectly well (I have never
used or tried UTF-16, though).
So this second issue is not a bug per se, but whoever fixes the first issue could take a look
at this one as well :-)
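
To illustrate the second point, here is a small sketch that turns hexadecimal character references back into the characters they stand for. CharRefDecoder and decodeNonAscii are hypothetical names, and the choice to decode only code points above 127 is an assumption made here so that references to markup characters such as &#x26; stay escaped.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Axis.
final class CharRefDecoder {

    // Matches hexadecimal character references such as &#xE9;
    private static final Pattern HEX_REF =
            Pattern.compile("&#x([0-9A-Fa-f]{1,6});");

    /** Replaces &#xHH; references to non-ASCII code points with the character. */
    static String decodeNonAscii(String s) {
        Matcher m = HEX_REF.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int cp = Integer.parseInt(m.group(1), 16);
            // ASCII references (for example &#x26; for &) and out-of-range
            // values are copied through unchanged.
            String repl = (cp > 127 && Character.isValidCodePoint(cp))
                    ? new String(Character.toChars(cp))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(repl));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodeNonAscii("caf&#xE9; &#x26; cr&#xE8;me"));
    }
}
```

Character.toChars handles code points beyond the Basic Multilingual Plane by producing a surrogate pair, so the result is an ordinary Java (UTF-16) string that can then be written out as UTF-8.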

This message is automatically generated by JIRA.
