commons-commits mailing list archives

Subject svn commit: r1075691 - /commons/proper/lang/trunk/src/site/xdoc/article3_0.xml
Date Tue, 01 Mar 2011 07:31:23 GMT
Author: bayard
Date: Tue Mar  1 07:31:23 2011
New Revision: 1075691

Adding information on the text.translate package


Modified: commons/proper/lang/trunk/src/site/xdoc/article3_0.xml
--- commons/proper/lang/trunk/src/site/xdoc/article3_0.xml (original)
+++ commons/proper/lang/trunk/src/site/xdoc/article3_0.xml Tue Mar  1 07:31:23 2011
@@ -79,7 +79,31 @@ we will remove the related methods in La
 <section name="New packages">
 <p>Two new packages have shown up. org.apache.commons.lang3.concurrent, which unsurprisingly provides support classes for
 multi-threaded programming, and org.apache.commons.lang3.text.translate, which provides a pluggable API for text transformation. </p>
-<!-- TODO: Add examples -->
+<!-- TODO: <h3>concurrent.*</h3> -->
+<p>A common complaint with StringEscapeUtils was that its escapeXml and escapeHtml methods should not be escaping non-ASCII characters. We agreed and made the change while creating a modular approach to let users define their own escaping constructs. </p>
+<p>The simplest way to show this is to look at the code that implements escapeXml:</p>
+    return ESCAPE_XML.translate(input);
+<p>Very simple. Maybe a bit too very simple, let's look a bit deeper. </p>
+    public static final CharSequenceTranslator ESCAPE_XML =
+        new AggregateTranslator(
+            new LookupTranslator(EntityArrays.BASIC_ESCAPE()),
+            new LookupTranslator(EntityArrays.APOS_ESCAPE())
+        );
+<p>Here we see that <code>ESCAPE_XML</code> is a '<code>CharSequenceTranslator</code>', which in turn is made up of two lookup translators based on the basic XML escapes and another to escape apostrophes. This shows one way to combine translators. Another can be shown by looking at the example to achieve the old XML escaping functionality (escaping non-ASCII):
+          StringEscapeUtils.ESCAPE_XML.with( UnicodeEscaper.above(0x7f) );
+<p>That takes the standard Commons Lang provided escape functionality, and adds on another translation layer. Another JIRA requested option was to also escape non-printable ASCII, this is now achievable with a modification of the above: </p>
+          StringEscapeUtils.ESCAPE_XML.with( UnicodeEscaper.outsideOf(32, 0x7f) );
+<p>You can also implement your own translators (be they for escaping, unescaping or some aspect of your own). See the <code>CharSequenceTranslator</code> and its <code>CodePointTranslator</code> helper subclass for details - primarily a case of implementing the translate(CharSequence, int, Writer);int method. </p>
 <section name="New classes + methods">
 <p>There are many new classes and methods in Lang 3.0 - the most complete way to see the changes is via this <a href="lang2-lang3-clirr-report.html">Lang2 to Lang3 Clirr report</a>. </p>
@@ -110,6 +134,7 @@ multi-threaded programming, and org.apac
 <li>StringUtils.isAlpha, isNumeric and isAlphanumeric now all return false when passed an empty String. Previously they returned true. </li>
 <li>SystemUtils.isJavaVersionAtLeast now relies on the <code>java.specification.version</code> and not the <code>java.version</code> System property. </li>
+<li>StringEscapeUtils.escapeXml and escapeHtml no longer escape high value unicode characters by default. The text.translate package is available to recreate the old behaviour.
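
The text added in this commit describes a contract of the form translate(CharSequence, int, Writer) returning an int, plus two ways of combining translators (aggregation and with). The sketch below shows how such a contract could work, using only the JDK. Every name in it (TranslatorSketch, Translator, lookup, aggregate, numericEntityAbove) is a hypothetical stand-in, not a class or method from org.apache.commons.lang3.text.translate. The decimal entity output format and the rule that a return value of 0 means "not handled" are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Map;

// Hypothetical stand-in types; NOT the Commons Lang implementation.
final class TranslatorSketch {

    /** Assumed contract: write a translation for the text at index, return chars consumed (0 = not handled). */
    interface Translator {
        int translate(CharSequence input, int index, Writer out) throws IOException;

        /** Walks the whole input, copying any char that the translator declines to handle. */
        default String translate(CharSequence input) {
            StringWriter out = new StringWriter();
            try {
                int i = 0;
                while (i < input.length()) {
                    int consumed = translate(input, i, out);
                    if (consumed == 0) {
                        out.write(input.charAt(i));
                        i++;
                    } else {
                        i += consumed;
                    }
                }
            } catch (IOException e) {
                throw new IllegalStateException(e); // not expected: StringWriter does not throw
            }
            return out.toString();
        }

        /** Adds another translation layer that is tried after this one. */
        default Translator with(Translator next) {
            return TranslatorSketch.aggregate(this, next);
        }
    }

    /** Replaces any key found at the current index with its mapped value. Keys must be non-empty. */
    static Translator lookup(Map<String, String> table) {
        return (input, index, out) -> {
            for (Map.Entry<String, String> e : table.entrySet()) {
                String key = e.getKey();
                int end = index + key.length();
                if (end <= input.length() && key.contentEquals(input.subSequence(index, end))) {
                    out.write(e.getValue());
                    return key.length();
                }
            }
            return 0;
        };
    }

    /** Tries each translator in order; the first one that consumes input wins. */
    static Translator aggregate(Translator... parts) {
        return (input, index, out) -> {
            for (Translator t : parts) {
                int consumed = t.translate(input, index, out);
                if (consumed != 0) {
                    return consumed;
                }
            }
            return 0;
        };
    }

    /** Writes code points above the threshold as decimal entities (output format is an assumption). */
    static Translator numericEntityAbove(int threshold) {
        return (input, index, out) -> {
            int codePoint = Character.codePointAt(input, index);
            if (codePoint <= threshold) {
                return 0;
            }
            out.write("&#" + codePoint + ";");
            return Character.charCount(codePoint);
        };
    }

    /** Two lookup tables combined, in the same shape as the ESCAPE_XML definition in the diff. */
    static final Translator ESCAPE_XML = aggregate(
            lookup(Map.of("\"", "&quot;", "&", "&amp;", "<", "&lt;", ">", "&gt;")),
            lookup(Map.of("'", "&apos;")));
}
```

With these stand-ins, TranslatorSketch.ESCAPE_XML.translate(text) leaves non-ASCII characters alone, and TranslatorSketch.ESCAPE_XML.with(TranslatorSketch.numericEntityAbove(0x7f)) layers non-ASCII escaping on top, which is the composition pattern the article text describes.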
