commons-issues mailing list archives

From "Ilguiz Latypov (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (TEXT-42) [XSS] Possible attacks through StringEscapeUtils.escapeEcmaScript?
Date Sun, 29 Oct 2017 16:11:00 GMT

    [ https://issues.apache.org/jira/browse/TEXT-42?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16223929#comment-16223929
] 

Ilguiz Latypov edited comment on TEXT-42 at 10/29/17 4:10 PM:
--------------------------------------------------------------

I wonder whether the use cases of escapeEcmaScript() can be scrutinized.

* Outputting a standalone JavaScript file containing string literals.  Generating string
literals to be surrounded by double or single quotes seems to be covered by the existing code
in escapeEcmaScript().
{code:java}
String dq = Character.toString('"');
out.println("alert(" + dq + escapeEcmaScript(input) + dq + ");");
{code}
* Outputting an HTML attribute containing JavaScript that itself contains string literals.  This needs
a new method *escapeHtmlAttr*.  Depending on whether the attribute value is quoted, every
character of the value goes through either a minimal substitution of [single/double
quotes and ampersand|https://html.spec.whatwg.org/multipage/parsing.html#attribute-value-(double-quoted)-state]
with HTML entities, or a broader replacement of [whitespace, ampersand, single/double
quotes, equals, greater/less-than and backquotes|https://html.spec.whatwg.org/multipage/parsing.html#attribute-value-(unquoted)-state].
For safety, the broader escaping should be the default (with the narrow one available as an option).
For example:
{code:java}
out.println("onmouseover=" + dq + escapeHtmlAttr("alert(" + dq + escapeEcmaScript(input) +
dq + ")") + dq);
{code}
* Outputting string literals in the script tag contents.  Because browsers allow readable JavaScript
between the script tags, they [do not apply a straight decoding algorithm|https://stackoverflow.com/questions/41297404/is-it-possible-to-correctly-escape-arbitrary-script-tag-contents]
like the one used for HTML attributes.  escapeEcmaScript leaves the ampersand character
unescaped, which follows this rule and avoids redundant escaping.

Another decoding still applies, and at first glance the escaping code appears vulnerable to it.
According to the WHATWG HTML parsing rules, the end script tag </script> terminates the script
element regardless of the state of JavaScript parsing.  But thanks to LANG-363 in 2007, the
JavaScript string literal escaping already prevents injection of the end script tag by
backslash-escaping the forward slash '/'.
{code:java}
out.println("<script>alert(" + dq + escapeEcmaScript(input) + dq + ")</script>");
{code}
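
The escapeHtmlAttr method proposed in the second bullet does not exist in Commons Text. The following is only a minimal sketch of the broader (unquoted-safe) variant, assuming numeric character references are acceptable for whitespace, equals and backquote. The class name HtmlAttrSketch and the choice of entities are assumptions made for illustration.

```java
final class HtmlAttrSketch {
    // Hypothetical helper, not part of Commons Text: escapes every character that
    // could end or confuse an attribute value in the quoted or unquoted states.
    static String escapeHtmlAttr(String value) {
        StringBuilder sb = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&#39;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '=':  sb.append("&#61;");  break;
                case '`':  sb.append("&#96;");  break;
                case ' ': case '\t': case '\n': case '\f': case '\r':
                    // Whitespace would terminate an unquoted attribute value.
                    sb.append("&#").append((int) c).append(';');
                    break;
                default:
                    sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Applied to an already JavaScript-escaped snippet such as alert("a\"b"), the sketch replaces each double quote with &quot; and leaves the backslash alone, so the browser's attribute decoding restores the JavaScript source before it is parsed.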



was (Author: ilatypov):
I wonder if the escapeEcmaScript()'s use cases can be scrutinized.

* Outputting a standalone javascript file containing string literals.  The generation of string
literals to be surrounded by double or single quotes seems to be covered by the existing code
in escapeEcmaScript().
{code:java}
String dq = Character.toString('"');
out.println("alert(" + dq + escapeEcmaScript(input) + dq + ");");
{code}
* Outputting an HTML attribute containing javascript containing string literals.  This needs
a new method *escapeHtmlAttr*.  Depending on the surrounding quotes or absence of them, all
characters of the attribute value will go through either a minimal substitution of [single/double
quotes and ampersand|https://html.spec.whatwg.org/multipage/parsing.html#attribute-value-(double-quoted)-state]
with the HTML entity or through a broader replacement of [whitespace, ampersand, single/double
quotes, equals, greater/less-than and backquotes|https://html.spec.whatwg.org/multipage/parsing.html#attribute-value-(unquoted)-state].
Safety calls to use the broader escaping by default (and allow the narrow one as an option).
I.e.
{code:java}
out.println("onmouseover=" + dq + escapeHtmlAttr("alert(" + dq + escapeEcmaScript(input) +
dq + ")") + dq);
{code}
* Outputting string literals in the script tag contents. Because browsers allow readable javascript
between the script tags, browsers [do not apply a straight decoding algorithm|https://stackoverflow.com/questions/41297404/is-it-possible-to-correctly-escape-arbitrary-script-tag-contents]
similar to one in HTML attributes.  The code of escapeEcmaScript omitting the ampersand character
from escaping follows this rule and therefore avoids redundant escaping.
&nbsp;
Another decoding still applies and the escaping code appears vulnerable to it.  According
to the WHATWG HTML parsing rules, the end script tag </script> will disrupt javascript
parsing in any state.  But thanks to LANG-421 in 2008, the javascript string literal escaping
already prevents from injecting the end script tag by backslash-escaping the forward slash
'/'.
{code:java}
out.println("<script>alert(" + dq + escapeEcmaScript(input) + dq + ")</script>");
{code}


> [XSS] Possible attacks through StringEscapeUtils.escapeEcmaScript?
> ------------------------------------------------------------------
>
>                 Key: TEXT-42
>                 URL: https://issues.apache.org/jira/browse/TEXT-42
>             Project: Commons Text
>          Issue Type: Bug
>            Reporter: Andy Reek
>              Labels: XSS
>             Fix For: 1.x
>
>
> org.apache.commons.lang3.StringEscapeUtils.escapeEcmaScript escapes by prefixing a '\'
> to every character that must be escaped. I am not sure whether this is really secure,
> judging by the comments on
> https://www.owasp.org/index.php/XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet#RULE_.233_-_JavaScript_Escape_Before_Inserting_Untrusted_Data_into_JavaScript_Data_Values.
> They say it is possible to mount an attack by escaping the escape. I tested this with the
> string '\"' and the output was '\\\"'. Is this really secure for ECMAScript/JavaScript? Or is
> it better to use the implementation used by OWASP?
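
The result reported above can be reproduced with a small stand-in escaper. This is a minimal sketch, not the Commons Lang or Commons Text implementation: JsEscapeSketch and escapeJsString are hypothetical names, and only a handful of characters are handled. Because the backslash is itself escaped, the input \" becomes \\\", which a JavaScript parser reads back as an escaped backslash followed by an escaped quote, so the string literal is not closed early.

```java
final class JsEscapeSketch {
    // Hypothetical stand-in for a JavaScript string-literal escaper (assumption:
    // only backslash, quotes, forward slash and line breaks are covered here).
    static String escapeJsString(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 8);
        for (char c : input.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break; // escape the escape character first
                case '"':  sb.append("\\\""); break;
                case '\'': sb.append("\\'");  break;
                case '/':  sb.append("\\/");  break; // keeps "</script>" from appearing verbatim
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With this sketch, the input </script> comes out as <\/script>, so the HTML parser never sees a literal end tag inside the string.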



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
