commons-dev mailing list archives

From Pelle Nilsson
Subject [lang] 8-bit characters and StringEscapeUtils
Date Thu, 16 Oct 2003 09:03:54 GMT

When using StringEscapeUtils to generate XML or HTML that will be encoded
as UTF-8 or ISO-8859-1, it should be perfectly OK (I think) to leave
certain characters unescaped (like the Swedish characters å, ä and ö)
instead of replacing them with markup entities. Of course, which
characters can be left unescaped depends on the encoding of the document
the strings are meant to be used in. Has anyone thought of adding
encoding-sensitive methods to StringEscapeUtils (as has been done in many
places in java.lang and java.util in later versions of Java)? Something
like:

    public static String escapeXml(String str, String encoding);
    public static String escapeHtml(String str, String encoding);

    escapeHtml("ä&") => "&auml;&amp;"
    escapeHtml("ä&", "US-ASCII") => "&auml;&amp;"
    escapeHtml("ä&", "UTF-8") => "ä&amp;"
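
To make the idea concrete, here is a rough sketch of how such a method might work. It is only an illustration, not existing commons-lang code: the class name EncodingAwareEscaper is made up, and it assumes that java.nio.charset.CharsetEncoder.canEncode is a good enough test for "this character can be left as is". Unlike the escapeHtml examples above, it falls back to numeric character references (&#228;) rather than named entities (&auml;), since that needs no entity table.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper for illustration only; not part of commons-lang.
final class EncodingAwareEscaper {

    private EncodingAwareEscaper() {}

    // Always escapes the five XML markup characters. Any other character is
    // left untouched if the named encoding can represent it, and is written
    // as a numeric character reference otherwise.
    static String escapeXml(String str, String encoding) {
        if (str == null) {
            return null;
        }
        CharsetEncoder encoder = Charset.forName(encoding).newEncoder();
        StringBuilder out = new StringBuilder(str.length() + 16);
        for (int i = 0; i < str.length(); ) {
            int cp = str.codePointAt(i);
            int len = Character.charCount(cp);
            switch (cp) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:
                    // Assumption: canEncode decides what may stay unescaped.
                    if (encoder.canEncode(str.subSequence(i, i + len))) {
                        out.appendCodePoint(cp);
                    } else {
                        out.append("&#").append(cp).append(';');
                    }
            }
            i += len;
        }
        return out.toString();
    }
}
```

With this sketch, escapeXml("ä&", "UTF-8") and escapeXml("ä&", "ISO-8859-1") both leave the ä alone and only escape the ampersand, while escapeXml("ä&", "US-ASCII") replaces the ä with &#228;. An escapeHtml variant could presumably do the same check and look up a named entity before falling back to the numeric form.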

