Return-Path:
Delivered-To: apmail-commons-issues-archive@locus.apache.org
Received: (qmail 35010 invoked from network); 9 Jun 2008 07:56:11 -0000
Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.2) by minotaur.apache.org with SMTP; 9 Jun 2008 07:56:11 -0000
Received: (qmail 47749 invoked by uid 500); 9 Jun 2008 07:56:13 -0000
Delivered-To: apmail-commons-issues-archive@commons.apache.org
Received: (qmail 47305 invoked by uid 500); 9 Jun 2008 07:56:12 -0000
Mailing-List: contact issues-help@commons.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: issues@commons.apache.org
Delivered-To: mailing list issues@commons.apache.org
Received: (qmail 47290 invoked by uid 99); 9 Jun 2008 07:56:12 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 09 Jun 2008 00:56:11 -0700
X-ASF-Spam-Status: No, hits=-2000.0 required=10.0 tests=ALL_TRUSTED
X-Spam-Check-By: apache.org
Received: from [140.211.11.140] (HELO brutus.apache.org) (140.211.11.140) by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 09 Jun 2008 07:55:23 +0000
Received: from brutus (localhost [127.0.0.1]) by brutus.apache.org (Postfix) with ESMTP id 154F1234C136 for ; Mon, 9 Jun 2008 00:55:45 -0700 (PDT)
Message-ID: <203855166.1212998145086.JavaMail.jira@brutus>
Date: Mon, 9 Jun 2008 00:55:45 -0700 (PDT)
From: "Jochen Wiedmann (JIRA)"
To: issues@commons.apache.org
Subject: [jira] Commented: (LANG-439) StringEscapeUtils.escapeHTML() does not escape chars (0x00-0x20)
In-Reply-To: <1166462737.1212157005229.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Virus-Checked: Checked by ClamAV on apache.org

    [ https://issues.apache.org/jira/browse/LANG-439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12603496#action_12603496 ]

Jochen Wiedmann commented on LANG-439:
--------------------------------------

It is crystal clear that escapeXml *must not* escape such characters but should throw an exception. I am not sure what the right behaviour is for HTML documents. For practical reasons, though, I'd be in favour of the same handling as for XML. Whoever needs these binary characters should use Base64 or something similar. At the very least, I'd wait for an explicit indication that escaped 0x00 characters *are* valid in HTML.

However, I must admit that I do not like the current implementation. Simply *ignoring* such characters is, IMO, worse than trying to escape them. We should throw an exception if we find characters that we suspect to be invalid.

> StringEscapeUtils.escapeHTML() does not escape chars (0x00-0x20)
> ----------------------------------------------------------------
>
>                 Key: LANG-439
>                 URL: https://issues.apache.org/jira/browse/LANG-439
>             Project: Commons Lang
>          Issue Type: Bug
>    Affects Versions: 2.4
>         Environment: java5
>            Reporter: Pavel Sivolobtchik
>             Fix For: 3.0
>
>
> I encountered this problem when I sent HTML from the server to a client using AjaxRequest. The HTML was escaped and wrapped in CDATA, so I thought it was pretty safe. See my XML fragment below:
> //------------------------------------------------------------------------------------------
>
>
>
>
> May 29 10:48:29 rdia643 su: - 2 nitroqa-nss
> ]]> >
>
> //------------------------------------------------------------------------------------------
> However, in FF2 there was a JS error:
> //--------------------------------------------------------------------------------------------
> Error: not well-formed
> Source Code:
> May 29 10:48:29 rdia643 su: - 2 nitroqa-nss
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------^
> I figured out that StringEscapeUtils.escapeHtml() did not escape one of the characters. It was a '\b' (ASCII 8).
> I had to change the org.apache.commons.lang.Entities.escape() method to:
>     public void escape(Writer writer, String str) throws IOException {
>         int len = str.length();
>         for (int i = 0; i < len; i++) {
>             char c = str.charAt(i);
>             String entityName = this.entityName(c);
>             if (entityName == null) {
>                 // no named entity: write control and non-ASCII chars as numeric references
>                 if (c < 0x20 || c > 0x7F) {
>                     writer.write("&#");
>                     writer.write(Integer.toString(c, 10));
>                     writer.write(';');
>                 }
>                 else {
>                     writer.write(c);
>                 }
>             }
>             else {
>                 writer.write('&');
>                 writer.write(entityName);
>                 writer.write(';');
>             }
>         }
>     }
> //---------------------------------------------------------------------------------------
> It can be tested with this unit test:
> import java.io.Reader;
> import java.io.StringReader;
> import junit.framework.TestCase;
> import org.apache.commons.lang.StringEscapeUtils;
> import org.jdom.input.SAXBuilder;
> public class StringEscapeUtilsTest extends TestCase {
>     public void testPR73092() throws Exception {
>         StringBuilder test = new StringBuilder(50);
>         for (int i = 0; i <= 50; i++) {
>             test.append((char)i);
>         }
>         StringBuilder result = new StringBuilder("\n result.append(StringEscapeUtils.escapeHtml(test.toString()));
>         result.append("\n]]>\n\n");
>         validate(new StringReader(result.toString()));
>         result = new StringBuilder("\n result.append(test.toString());
>         result.append("\n]]>\n\n");
>         try {
>             validate(new StringReader(result.toString()));
>             fail("expected to blow up");
>         }
>         catch (Exception e) {
>             // expected: the unescaped control characters are not well-formed XML
>         }
>     }
>     /** make sure that xml is well-formed */
>     private static void validate(Reader xmlSource) throws Exception {
>         SAXBuilder saxBuilder = new SAXBuilder();
>         saxBuilder.build(xmlSource);
>     }
> }

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
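For illustration, the "throw an exception" behaviour suggested in the comment could be sketched roughly as follows. This is only a sketch under stated assumptions, not the Commons Lang implementation: the class XmlCharCheck and the methods isXml10Char and requireXml10Chars are hypothetical names that do not exist in Commons Lang. The ranges follow the Char production of XML 1.0, which allows only #x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD and #x10000-#x10FFFF, so '\b' (0x08) is rejected.

```java
// Hypothetical helper, not part of Commons Lang: rejects input containing
// characters that XML 1.0 does not allow, instead of silently passing them on.
final class XmlCharCheck {

    private XmlCharCheck() {
    }

    /** Returns true if the code point matches the XML 1.0 Char production. */
    public static boolean isXml10Char(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Throws IllegalArgumentException on the first character not allowed in XML 1.0. */
    public static void requireXml10Chars(String s) {
        for (int i = 0; i < s.length(); ) {
            // codePointAt returns the bare surrogate value for an unpaired
            // surrogate, which falls outside the allowed ranges and is rejected.
            int cp = s.codePointAt(i);
            if (!isXml10Char(cp)) {
                throw new IllegalArgumentException("Character U+"
                        + Integer.toHexString(cp).toUpperCase()
                        + " at index " + i + " is not allowed in XML 1.0");
            }
            i += Character.charCount(cp);
        }
    }
}
```

A caller could run requireXml10Chars(text) before escaping or wrapping text in CDATA, so that a stray '\b' fails on the server rather than surfacing as "not well-formed" in the browser.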