commons-dev mailing list archives

From: Gary Gregory
Subject: RE: [lang] Entity Escaping (HTML/XML)
Date: Wed, 09 Apr 2003 22:31:18 GMT
I would prefer to see HTML code in an HTML class (HTMLUtils?), with
subclasses for both HTML versions. HTML is clearly semantically different
from simple Strings, and StringUtils is already a 2000+ line giant; these
are new methods, so there are no backward-compatibility issues. IMHO, small
classes with small methods seem to be a better philosophy in this particular
case.

Having methods with version numbers does not seem very OO.

Furthermore, if I wanted to migrate an app from "no version" HTML to 3.2 and
then to 4.0, I would have to change call sites all over my app. With a set
of HTML classes, I would just change one line of code with the class name, or
perhaps put the class name in a properties file for even easier

Gary G.
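
A rough sketch of the class-per-version idea described above, not code from commons-lang. Every name here (HtmlEscaper, Html32Escaper, Html40Escaper, the "html.escaper.class" property key) is hypothetical, and the entity tables are cut down to a few entries for illustration.

```java
import java.util.Properties;

// Hypothetical base class: the escaping loop lives here, and each HTML
// version only supplies its own table of named entities.
abstract class HtmlEscaper {
    // Returns the entity name for a code point, or null if this version has none.
    protected abstract String nameOf(int codePoint);

    public final String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            String name = nameOf(cp);
            if (name != null) {
                out.append('&').append(name).append(';');
            } else if (cp > 0x7F) {
                // No name in this version: fall back to a numeric reference.
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    // Picks the implementation from a property (assumed key), defaulting to
    // the newest version, so migrating means changing one configuration line.
    static HtmlEscaper fromProperties(Properties props) {
        String className = props.getProperty(
                "html.escaper.class", Html40Escaper.class.getName());
        try {
            return (HtmlEscaper) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot load " + className, e);
        }
    }
}

// Illustrative subset of the HTML 3.2 entities.
class Html32Escaper extends HtmlEscaper {
    @Override
    protected String nameOf(int codePoint) {
        switch (codePoint) {
            case '&': return "amp";
            case '<': return "lt";
            case '>': return "gt";
            case 0xE9: return "eacute";
            default: return null;
        }
    }
}

// HTML 4.0 adds names on top of 3.2; only Scaron is shown here.
class Html40Escaper extends Html32Escaper {
    @Override
    protected String nameOf(int codePoint) {
        return codePoint == 0x160 ? "Scaron" : super.nameOf(codePoint);
    }
}
```

With this shape, call sites only ever see HtmlEscaper.escape(String); the HTML version is chosen once, where the escaper is created.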

-----Original Message-----
From: Alex Chaffee / Purple Technology
Sent: Wednesday, April 09, 2003 2:48 PM
To: Jakarta Commons Developers List
Subject: [lang] Entity Escaping (HTML/XML)

Now escapeHtml, escapeXml, and their unescape versions are checked in.
Anyone who thinks they may use them, please double-check my code,
tests, and conversion from the DTDs.

Some further thoughts...

These methods use the built-in named entities (like "&amp;" and
"&eacute;") from the most current version of HTML (which is now 4.01)
and XML (1.0).

While most users will want to use the most current set of named
entities, some will need to target a specific browser.  For them, the
current version may have too many entities -- their target browser may
not understand what "&Scaron;" means and they would prefer the escaper
use "&#352;" instead.

Is it worth worrying about this case?

If we decide to provide a solution for this, we could use:

	String escapeHtml(String)
	String escapeHtml40(String)
	String escapeHtml32(String)

However, that doesn't scale as well as the following:

	String escapeHtml(String)   -- always use the most current HTML
	String escapeEntities(String) -- use numeric escapes only
	String escapeEntities(String, Entities.HTML40) -- use HTML 4.0 
	String escapeEntities(String, Entities.HTML32) -- use HTML 3.2

	...and so on for other (as yet unknown) sets inside Entities.
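
As a minimal sketch of how the second set of signatures could fit together: the EntitySet type below is a hypothetical stand-in for the private Entities class, whose actual shape is not shown in this message, and its tables hold only a handful of entries rather than the full DTD lists.

```java
import java.util.HashMap;
import java.util.Map;

final class EntityEscapeSketch {

    // Hypothetical stand-in for Entities: maps code points to entity names.
    static final class EntitySet {
        // Empty set: nothing has a name, so all escapes come out numeric.
        static final EntitySet NONE = new EntitySet();
        // Illustrative subsets only, not the complete HTML 3.2 / 4.0 tables.
        static final EntitySet HTML32 = new EntitySet()
                .add('&', "amp").add('<', "lt").add('>', "gt")
                .add(0xE9, "eacute");
        static final EntitySet HTML40 = new EntitySet()
                .add('&', "amp").add('<', "lt").add('>', "gt")
                .add(0xE9, "eacute").add(0x160, "Scaron");

        private final Map<Integer, String> names = new HashMap<>();

        EntitySet add(int codePoint, String name) {
            names.put(codePoint, name);
            return this;
        }

        // Returns the entity name, or null if this set does not define one.
        String nameOf(int codePoint) {
            return names.get(codePoint);
        }
    }

    // Use a name when the set has one; otherwise use a numeric reference for
    // markup characters and anything outside ASCII.
    static String escapeEntities(String input, EntitySet set) {
        StringBuilder out = new StringBuilder(input.length() * 2);
        for (int i = 0; i < input.length(); ) {
            int cp = input.codePointAt(i);
            i += Character.charCount(cp);
            String name = set.nameOf(cp);
            if (name != null) {
                out.append('&').append(name).append(';');
            } else if (cp > 0x7F || cp == '&' || cp == '<' || cp == '>') {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    // Numeric escapes only.
    static String escapeEntities(String input) {
        return escapeEntities(input, EntitySet.NONE);
    }

    // Always the most current HTML set.
    static String escapeHtml(String input) {
        return escapeEntities(input, EntitySet.HTML40);
    }
}
```

Under these assumptions, a caller targeting an older browser passes EntitySet.HTML32 and gets "&#352;" for U+0160, while escapeHtml yields "&Scaron;" for the same character.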

escapeEntities and Entities.HTMLXX already exist as private members;
exposing them would be straightforward.

And if we made the Entities class public, then users could roll their
own set.  This would be the most flexible but perhaps overly

No urgency here, but I wanted to get my thoughts on record.

Cheers -

 - Alex

Alex Chaffee                     
Purple Technology - Code and Consulting
jGuru - Java News and FAQs       
Gamelan - the Original Java site 
Stinky - Art and Angst           
