axis-java-dev mailing list archives

From Amila Suriarachchi <amilasuriarach...@gmail.com>
Subject Re: Handling xml strings with Axiom
Date Mon, 01 Aug 2011 10:55:18 GMT
It seems that escaping > is not a MUST [1]. In character data it only has to be escaped when it appears in the string "]]>".

thanks,
Amila.

[1] http://www.w3.org/TR/xml/#syntax
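
A quick way to confirm that the serialized output from the quoted message is still well-formed is to parse it back and compare the text. The sketch below uses only the JDK's built-in DOM parser rather than Axiom; the class and method names (UnescapedGtCheck, rootText) are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Axiom or Woodstox.
final class UnescapedGtCheck {

    /** Parses the given XML and returns the text content of its root element. */
    static String rootText(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers do not need a throws clause.
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    public static void main(String[] args) {
        // Same shape as the output in the quoted message: '<' is escaped, '>' is not.
        String serialized = "<ns1:TestElement xmlns:ns1=\"http://mynamespace\">"
                + "&lt;test>mytest&lt;/test></ns1:TestElement>";
        System.out.println(rootText(serialized)); // prints <test>mytest</test>
    }
}
```

If the parser gives back the original string, the unescaped > did no harm on the round trip.")) throw new AssertionError("round trip failed");
if (!UnescapedGtCheck.rootText("<a>x > y</a>").equals("x > y")) throw new AssertionError("plain > failed");
</test>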

On Mon, Aug 1, 2011 at 3:42 PM, Amila Suriarachchi <
amilasuriarachchi@gmail.com> wrote:

> I wrote the following code to test another issue I came across.
>
>         String xmlString = "<test>mytest</test>";
>
>         OMFactory omFactory = OMAbstractFactory.getOMFactory();
>         OMNamespace omNamespace = omFactory.createOMNamespace("http://mynamespace", "ns1");
>         OMElement omElement = omFactory.createOMElement("TestElement", omNamespace);
>
>         omElement.setText(xmlString);
>
>         System.out.println("OMElement text ==> " + omElement.toString());
>
> This outputs an XML string like this:
>
> <ns1:TestElement xmlns:ns1="http://mynamespace">&lt;test>mytest&lt;/test></ns1:TestElement>
>
> Note that > is not encoded. Is this correct behavior? Looking at the
> Woodstox code, I found that it intentionally escapes > only when it is
> the first character of the text being written or directly follows a ]:
>
> for (; offset < len; ++offset) {
>                     c = cbuf[offset];
>                     if (c <= HIGHEST_ENCODABLE_TEXT_CHAR) {
>                         if (c == '<') {
>                             ent = "&lt;";
>                             break;
>                         } else if (c == '&') {
>                             ent = "&amp;";
>                             break;
>                         } else if (c == '>') {
>                             /* Let's be conservative; and if there's any
>                              * change it might be part of "]]>" quote it
>                              */
>                             if ((offset == start) || cbuf[offset-1] == ']') {
>                                 ent = "&gt;";
>                                 break;
>                             }
>                         } else if (c < 0x0020) {
>                             if (c == '\n' || c == '\t') { // fine as is
>                                 ;
>                             } else if (c == '\r') {
>                                 if (mEscapeCR) {
>                                     break;
>                                 }
>                             } else {
>                                 if (!mXml11 || c == 0) {
>                                     throwInvalidChar(c);
>                                 }
>                                 break; // need quoting ok
>                             }
>                         }
>                     } else if (c >= highChar) {
>                         break;
>                     }
>                     // otherwise ok
>                 }
>
> Is there any reason for this?
>
> thanks,
> Amila.
>
> --
> Amila Suriarachchi
> WSO2 Inc.
> blog: http://amilachinthaka.blogspot.com/
>
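
For reference, the rule in the quoted loop can be boiled down to a few lines. This is only a simplified sketch, not the Woodstox implementation: it works on a whole String instead of a char buffer, treats the start of the string like the loop's offset == start case, and leaves out the control-character and high-character handling. The names TextEscapeSketch and escapeText are made up.

```java
// Hypothetical, simplified illustration of the escaping rule; not Woodstox code.
final class TextEscapeSketch {

    /**
     * Escapes '<' and '&' always, and '>' only when it is the first character
     * or directly follows ']' (so that "]]>" can never appear in the output).
     */
    static String escapeText(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '<') {
                out.append("&lt;");
            } else if (c == '&') {
                out.append("&amp;");
            } else if (c == '>' && (i == 0 || text.charAt(i - 1) == ']')) {
                // Conservative: this '>' might be the end of "]]>".
                out.append("&gt;");
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeText("<test>mytest</test>")); // &lt;test>mytest&lt;/test>
        System.out.println(escapeText("a]]>b"));               // a]]&gt;b
    }
}
```

The first call reproduces the text seen in the serialized element above, and the second shows the one case where > does get escaped.").equals("&lt;test>mytest&lt;/test>")) throw new AssertionError("markup case failed");
if (!TextEscapeSketch.escapeText("a]]>b").equals("a]]&gt;b")) throw new AssertionError("cdata end case failed");
if (!TextEscapeSketch.escapeText(">x").equals("&gt;x")) throw new AssertionError("leading > case failed");
if (!TextEscapeSketch.escapeText("a & b").equals("a &amp; b")) throw new AssertionError("ampersand case failed");
</test>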



-- 
Amila Suriarachchi
WSO2 Inc.
blog: http://amilachinthaka.blogspot.com/
