commons-issues mailing list archives

From "Jochen Wiedmann (JIRA)" <j...@apache.org>
Subject [jira] Commented: (LANG-439) StringEscapeUtils.escapeHTML() does not escape chars (0x00-0x20)
Date Mon, 09 Jun 2008 07:55:45 GMT

    [ https://issues.apache.org/jira/browse/LANG-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603496#action_12603496 ]

Jochen Wiedmann commented on LANG-439:
--------------------------------------

It is crystal clear that escapeXml *must* not escape such characters, but should throw an
exception instead.

I have no idea what the right behaviour is for HTML documents. For practical reasons, though,
I'd be in favour of the same handling as for XML. Whoever needs these binary characters should
use BASE64 or something similar. At the very least, I'd wait for an explicit indication that
escaped 0x00 characters *are* valid in HTML.

However, I must admit that I do not like the current implementation. Simply *ignoring* such
characters is, IMO, worse than trying to escape them. IMO, we should throw an exception if
we find characters that we suspect to be invalid.
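
To make the "throw an exception" suggestion concrete, here is a minimal sketch, not the Commons Lang implementation. The class and method names (XmlCharCheck, isValidXml10, requireValidXml10) are hypothetical; the character ranges are those of the Char production in the XML 1.0 specification.

```java
public final class XmlCharCheck {
    private XmlCharCheck() {}

    /** Returns true if the code point matches the XML 1.0 "Char" production. */
    public static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Hypothetical helper: returns the input unchanged if every code point is
     * allowed in XML 1.0, otherwise throws instead of silently passing it on.
     */
    public static String requireValidXml10(String s) {
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isValidXml10(cp)) {
                throw new IllegalArgumentException(
                    "Invalid XML 1.0 character U+" + String.format("%04X", cp)
                    + " at index " + i);
            }
            i += Character.charCount(cp);
        }
        return s;
    }
}
```

With such a check in front of the escaping step, a '\b' (0x08) in the input would be rejected up front rather than being written into the document.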



> StringEscapeUtils.escapeHTML() does not escape chars (0x00-0x20)
> ----------------------------------------------------------------
>
>                 Key: LANG-439
>                 URL: https://issues.apache.org/jira/browse/LANG-439
>             Project: Commons Lang
>          Issue Type: Bug
>    Affects Versions: 2.4
>         Environment: java5
>            Reporter: Pavel Sivolobtchik
>             Fix For: 3.0
>
>
> I encountered this problem when I sent HTML from the server to a client using AjaxRequest.
> The HTML was escaped and wrapped in CDATA. I thought it was pretty safe. See my XML fragment below:
> //------------------------------------------------------------------------------------------
> <?xml version="1.0" encoding="UTF-8"?>
> <ajax-fragment>
> <html-rows>
> <![CDATA[
> <div style="padding-left: 1px;" class="columnContent4  column4">
> <span  column-id="Message"  class="cellContent"  onmouseover="w12450823.onDwell(event);
> w12450823.onCellSelectionOnMouseOver(event);"  onclick="w12450823.onCellSelectionOnClick(event)"
>  >May 29 10:48:29 rdia643 su: - 2 nitroqa-nss</span></div>
> ]]>
> </html-rows>
> </ajax-fragment>
> //------------------------------------------------------------------------------------------
> However, in FF2 there was a JS error:
> //--------------------------------------------------------------------------------------------

> Error: not well-formed
> Source Code:
> <span  column-id="Message"  class="cellContent "  onmouseover="w12450823.onDwell(event);
> w12450823.onCellSelectionOnMouseOver(event); " onclick="w12450823.onCellSelectionOnClick(event)"
>  >May 29 10:48:29 rdia643 su: - 2 nitroqa-nss</span></div
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------^
> I figured out that StringEscapeUtils.escapeHtml() did not escape one of the characters:
> it was a '\b' (ASCII 8).
> I had to change the org.apache.commons.lang.Entities.escape() method to:
> public void escape(Writer writer, String str) throws IOException {
> 	int len = str.length();
> 	for (int i = 0; i < len; i++) {
> 		char c = str.charAt(i);
> 		String entityName = this.entityName(c);
> 		if (entityName == null) {
> 			if (c < 0x20 || c > 0x7F) {
> 				writer.write("&#");
> 				writer.write(Integer.toString(c, 10));
> 				writer.write(';');
> 			}
> 			else {
> 				writer.write(c);
> 			}
> 		}
> 		else {
> 			writer.write('&');
> 			writer.write(entityName);
> 			writer.write(';');
> 		}
> 	}
> }
> //---------------------------------------------------------------------------------------
> It can be tested with a unit test:
> import java.io.Reader;
> import java.io.StringReader;
> import junit.framework.TestCase;
> import org.apache.commons.lang.StringEscapeUtils;
> import org.jdom.input.SAXBuilder;
> public class StringEscapeUtilsTest extends TestCase {
> public void testPR73092() throws Exception {
> 	StringBuilder test = new StringBuilder(50);
> 	for (int i = 0; i <= 50; i++) {
> 		test.append((char)i);
> 	}
> 	StringBuilder result = new StringBuilder("<test>\n<![CDATA[\n");
> 	result.append(StringEscapeUtils.escapeHtml(test.toString()));
> 	result.append("\n]]>\n</test>\n");
> 	validate(new StringReader(result.toString()));
> 	result = new StringBuilder("<test>\n<![CDATA[\n");
> 	result.append(test.toString());
> 	result.append("\n]]>\n</test>\n");
> 	try {
> 		validate(new StringReader(result.toString()));
> 		fail("expected to blow up");
> 	}
> 	catch (Exception e) {
> 		// expected: raw control characters make the XML not well-formed
> 	}
> }
> /** make sure that xml is well-formed */
> private static void validate(Reader xmlSource) throws Exception {
> 	SAXBuilder saxBuilder = new SAXBuilder();
> 	saxBuilder.build(xmlSource);
> }
> }
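
As a side note on the test above, the following sketch uses only the JDK's built-in parser (instead of JDOM's SAXBuilder) to show how an XML 1.0 parser treats character 8 in different positions. The class and method names (WellFormedCheck, isWellFormed) are hypothetical. It assumes a conforming XML 1.0 parser: a raw '\b' is rejected even inside CDATA, the text "&#8;" inside CDATA is just literal characters, and outside CDATA the reference "&#8;" is itself rejected.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public final class WellFormedCheck {
    private WellFormedCheck() {}

    /** Returns true if the given string parses as well-formed XML. */
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // not well-formed
        } catch (Exception e) {
            // Parser configuration or I/O problem, unrelated to the input's validity.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isWellFormed("<test><![CDATA[\b]]></test>"));   // raw control char
        System.out.println(isWellFormed("<test><![CDATA[&#8;]]></test>")); // literal text in CDATA
        System.out.println(isWellFormed("<test>&#8;</test>"));             // char reference to 0x08
    }
}
```

This suggests that numeric escaping only "works" in the reported scenario because the content sits inside CDATA, where the reference is never resolved by the XML parser.");
assert WellFormedCheck.isWellFormed("<test><![CDATA[&#8;]]></test>");
assert !WellFormedCheck.isWellFormed("<test>&#8;</test>");
assert WellFormedCheck.isWellFormed("<test>&#9;</test>");
</test>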

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

