commons-issues mailing list archives

From "Bruno P. Kinoshita (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (LANG-1042) StringEscapeUtils.escapeHtml() does not escape single quote
Date Sat, 25 Oct 2014 02:49:35 GMT

    [ https://issues.apache.org/jira/browse/LANG-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14183879#comment-14183879 ]

Bruno P. Kinoshita commented on LANG-1042:
------------------------------------------

> Duncan: Have a good think about the current functionality, then document it better so
> that people truly understand what it does and in which contexts it is useful (if any).

+1 

> Robert: what about deprecating this method and introducing a new one – secureHtmlEscape
> – that escapes <, >, ', ", and &?

+0

I think adding a secureHtmlEscape method could fix this issue, but I'm not sure whether it
would then lead to further variations or methods for XML, HTML, or other formats.
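
For illustration, a minimal sketch of what such a method might look like. The class name,
the null handling, and the choice of a numeric reference for the single quote are all
assumptions made for this example; this is not an existing Commons Lang API.

```java
// Hypothetical sketch only: not part of Commons Lang.
final class SecureHtmlEscapeSketch {

    /** Escapes the five characters named in the proposal: <, >, ', ", and &. */
    static String secureHtmlEscape(String input) {
        if (input == null) {
            return null; // assumption: null passes through unchanged
        }
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                // Numeric reference, because HTML 4 does not define &apos;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The payload from the issue description, escaped for use in an attribute value.
        System.out.println(secureHtmlEscape("' onclick='payload' "));
    }
}
```

A sketch like this covers only the HTML attribute and text contexts; XML or other formats
would still need their own handling, which is the concern raised above.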

> StringEscapeUtils.escapeHtml() does not escape single quote
> -----------------------------------------------------------
>
>                 Key: LANG-1042
>                 URL: https://issues.apache.org/jira/browse/LANG-1042
>             Project: Commons Lang
>          Issue Type: Bug
>            Reporter: Robert Sussland
>            Priority: Critical
>
> The String Escape Utils should ensure that encoded data cannot escape from a string.
> However, in HTML (starting with 1.0 and until the present), attribute values may be
> delimited by either single or double quotes. Therefore single quotes need to be escaped
> just as much as double quotes.
> From the standard: http://www.w3.org/TR/html4/intro/sgmltut.html#h-3.2.2
> {quote}
> By default, SGML requires that all attribute values be delimited using either double
> quotation marks (ASCII decimal 34) or single quotation marks (ASCII decimal 39). Single quote
> marks can be included within the attribute value when the value is delimited by double quote
> marks, and vice versa. Authors may also use numeric character references to represent double
> quotes (&#34;) and single quotes (&#39;). For double quotes authors can
> also use the character entity reference &quot;.
> {quote}
> Note that there have been several bugs in the wild in which string encoders use this
> library under the hood, and as a result fail to properly escape HTML attributes in which user
> input is stored:
> <div title='<%=user_data%>'>Howdy</div>
> if user_data = ' onclick='payload' ' 
> then an attacker can inject their code into the page even if the developer is using the
> string escape utils to escape the user string.
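
The scenario in the description can be reproduced with a small stand-alone sketch. The
fourCharEscape helper below is a hypothetical stand-in that escapes only &, <, > and ",
mirroring the reported behaviour for those characters; it is not the Commons Lang
implementation, and the class and method names are invented for this example.

```java
// Illustration only: fourCharEscape is a hypothetical stand-in, not Commons Lang code.
final class SingleQuoteBreakoutDemo {

    /** Escapes &, <, > and " but leaves the single quote untouched. */
    static String fourCharEscape(String s) {
        return s.replace("&", "&amp;") // ampersand first, to avoid double-escaping
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Renders the template from the description with a single-quoted attribute. */
    static String render(String userData) {
        return "<div title='" + fourCharEscape(userData) + "'>Howdy</div>";
    }

    public static void main(String[] args) {
        // Prints: <div title='' onclick='payload' '>Howdy</div>
        System.out.println(render("' onclick='payload' "));
    }
}
```

The leading single quote in the input closes the title attribute, so the browser sees
onclick as a separate attribute of the div.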



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
