commons-issues mailing list archives

From "Martin Barrs (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (LANG-898) StringEscapeUtils un/escapexml inconsistant with escaped whitespace
Date Fri, 07 Jun 2013 17:58:21 GMT

     [ https://issues.apache.org/jira/browse/LANG-898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Barrs updated LANG-898:
------------------------------

    Description: 
In an escaped XML string that contains escaped whitespace, in this case a linefeed ( & # 10; ),
escapeXml and unescapeXml treat the linefeed inconsistently.

unescapeXml converts & # 10; to a linefeed, yet escapeXml does not convert the linefeed back
to & # 10;, so an unescape/escape round trip does not reproduce the original string.

I've put spaces between the &, the # and the 10 in this description because Jira would otherwise interpret the entity:
& # 10;

Here's code and output...


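import org.apache.commons.lang3.StringEscapeUtils;

public class Lang898 {
    public static void main(String[] args) {
        // Two escaped XML declarations joined by an escaped linefeed (&#10;),
        // written here without the spaces used in the prose above.
        String escaped =
                "&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;"
                + "&#10;"
                + "&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;";

        // 1. The escaped input, unchanged.
        System.out.println(escaped);
        System.out.println();
        // 2. Unescaped: &#10; becomes a real linefeed.
        System.out.println(StringEscapeUtils.unescapeXml(escaped));
        System.out.println();
        // 3. Re-escaped: the linefeed is not turned back into &#10;.
        System.out.println(StringEscapeUtils.escapeXml(StringEscapeUtils
                .unescapeXml(escaped)));
    }
}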
    

Output:

&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;&#10;&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;

<?xml version="1.0" encoding="iso-8859-1"?>
<?xml version="1.0" encoding="iso-8859-1"?>

&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;
&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;
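
For comparison, here is a minimal sketch of an escaper that would make the round trip above symmetric. It is not the StringEscapeUtils implementation and not a proposed patch: the class name XmlWhitespaceEscaper and its escape method are hypothetical, and the choice to encode tab, linefeed and carriage return as numeric character references is an assumption about the desired behaviour. It uses only the JDK, so it can be run without Commons Lang.

```java
final class XmlWhitespaceEscaper {

    private XmlWhitespaceEscaper() {
    }

    /**
     * Escapes the five predefined XML entities and additionally writes
     * tab, linefeed and carriage return as numeric character references
     * (an assumed policy, chosen so that unescape followed by escape
     * gives back the original text).
     */
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                case '\t': out.append("&#9;");   break;
                case '\n': out.append("&#10;");  break;
                case '\r': out.append("&#13;");  break;
                default:   out.append(c);        // everything else passes through
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Same shape as the report: two declarations separated by a real linefeed.
        String unescaped = "<?xml version=\"1.0\"?>\n<?xml version=\"1.0\"?>";
        System.out.println(escape(unescaped));
    }
}
```

Whether this policy is wanted depends on where the text ends up: a literal linefeed is legal in element content, but in an attribute value an XML parser normalizes it to a space, while a character reference such as & # 10; survives.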

  was:
In an escaped XML string with escaped whitespace, in this case linefeed ( &#10; ), escapexml
and unescapexml treat the linefeed inconsistently. 

unescape converts &#10; to a linefeed, yet escapexml does not convert linefeed back to
&#10;


Here's code and output...


public static void main(String[] args) {
        String escaped =
                "&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;&#10;&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;";

        System.out.println(escaped);
        System.out.println();
        System.out.println(StringEscapeUtils.unescapeXml(escaped));
        System.out.println();
        System.out.println(StringEscapeUtils.escapeXml(StringEscapeUtils
                .unescapeXml(escaped)));

    }
    

Output:

&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;&#10;&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;

<?xml version="1.0" encoding="iso-8859-1"?>
<?xml version="1.0" encoding="iso-8859-1"?>

&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;
&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;

    
> StringEscapeUtils un/escapexml inconsistant with escaped whitespace
> -------------------------------------------------------------------
>
>                 Key: LANG-898
>                 URL: https://issues.apache.org/jira/browse/LANG-898
>             Project: Commons Lang
>          Issue Type: Bug
>          Components: General, lang.*
>    Affects Versions: 3.1
>         Environment: Windows 7, Java 7
>            Reporter: Martin Barrs
>
> In an escaped XML string with escaped whitespace, in this case linefeed ( & # 10; ),
> escapexml and unescapexml treat the linefeed inconsistently.
> unescape converts & # 10; to a linefeed, yet escapexml does not convert linefeed
> back to &#10;
> I've put spaces between the & and # and 10 in this bug as Jira will interpret it:
> & # 10;
> Here's code and output...
> public static void main(String[] args) {
>         String escaped =
>                 "&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;& #10 ;&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;";
>         System.out.println(escaped);
>         System.out.println();
>         System.out.println(StringEscapeUtils.unescapeXml(escaped));
>         System.out.println();
>         System.out.println(StringEscapeUtils.escapeXml(StringEscapeUtils
>                 .unescapeXml(escaped)));
>     }
>     
> Output:
> &lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;&#10;&lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;
> <?xml version="1.0" encoding="iso-8859-1"?>
> <?xml version="1.0" encoding="iso-8859-1"?>
> &lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;
> &lt;?xml version=&quot;1.0&quot; encoding=&quot;iso-8859-1&quot;?&gt;

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
