Date: Wed, 9 Apr 2003 14:48:25 -0700
From: Alex Chaffee / Purple Technology
To: Jakarta Commons Developers List
Subject: [lang] Entity Escaping (HTML/XML)
Message-ID: <20030409144825.D10697@stinky.com>
Reply-To: alex@jguru.com

escapeHtml, escapeXml, and their unescape versions are now checked in. Anyone who thinks they may use them, please double-check my code, tests, and conversion from the DTDs.

Some further thoughts...
These methods use the built-in named entities (like "&amp;" and "&eacute;") from the most current versions of HTML (now 4.01) and XML (1.0).

While most users will want to use the most current set of named entities, some will need to target a specific browser. For them, the current version may have too many entities -- their target browser may not understand what "&Scaron;" means, and they would prefer that the escaper use "&#352;" instead.

Is it worth worrying about this case?

If we decide to provide a solution for this, we could use:

    String escapeHtml(String)
    String escapeHtml40(String)
    String escapeHtml32(String)

However, that doesn't scale as well as the following:

    String escapeHtml(String) -- always use the most current HTML
    String escapeEntities(String) -- use numeric escapes only
    String escapeEntities(String, Entities.HTML40) -- use HTML 4.0
    String escapeEntities(String, Entities.HTML32) -- use HTML 3.2

...and so on for other (as yet unknown) sets inside Entities.

escapeEntities and Entities.HTMLXX already exist as private members, so exposing them would be straightforward. And if we made the Entities class public, users could roll their own set. This would be the most flexible option, but perhaps overly complicated.

No urgency here, but I wanted to get my thoughts on record.

Cheers -

 - Alex

--
Alex Chaffee                       mailto:alex@jguru.com
Purple Technology - Code and Consulting  http://www.purpletech.com/
jGuru - Java News and FAQs         http://www.jguru.com/alex/
Gamelan - the Original Java site   http://www.gamelan.com/
Stinky - Art and Angst             http://www.stinky.com/
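
For illustration, the second set of signatures above could look roughly like the following Java sketch. It is not the code that was checked in. The Entities class here is a hypothetical stand-in, its tables hold only a handful of entries rather than the full DTD sets, and the NONE constant and the rule that markup characters are always escaped are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

final class EntityEscapeSketch {

    /** Hypothetical stand-in for the Entities class; tables are tiny illustrative subsets. */
    static final class Entities {
        // Assumed constant: an empty set, so every escape falls back to numeric form.
        static final Entities NONE = new Entities();
        // Illustrative subset of the HTML 3.2 names (not the full DTD).
        static final Entities HTML32 = new Entities()
                .add("amp", '&').add("lt", '<').add("gt", '>').add("eacute", 0xE9);
        // Illustrative subset of HTML 4.0: everything above plus one newer name.
        static final Entities HTML40 = new Entities()
                .addAll(HTML32).add("Scaron", 0x160);

        private final Map<Integer, String> names = new HashMap<>();

        Entities add(String name, int codePoint) {
            names.put(codePoint, name);
            return this;
        }

        Entities addAll(Entities other) {
            names.putAll(other.names);
            return this;
        }

        /** Returns the entity name for a code point, or null if this set has none. */
        String nameOf(int codePoint) {
            return names.get(codePoint);
        }
    }

    /** Always uses the most current HTML set (HTML 4.0 in this sketch). */
    static String escapeHtml(String s) {
        return escapeEntities(s, Entities.HTML40);
    }

    /** Uses numeric escapes only. */
    static String escapeEntities(String s) {
        return escapeEntities(s, Entities.NONE);
    }

    /** Uses a named entity when the given set has one, otherwise a numeric escape. */
    static String escapeEntities(String s, Entities set) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            String name = set.nameOf(cp);
            if (name != null) {
                out.append('&').append(name).append(';');
            } else if (cp > 0x7F || cp == '&' || cp == '<' || cp == '>' || cp == '"') {
                // Assumption: non-ASCII and markup characters are never passed through raw.
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String input = "\u0160 & \u00e9"; // S-caron, ampersand, e-acute
        System.out.println(escapeHtml(input));                      // &Scaron; &amp; &eacute;
        System.out.println(escapeEntities(input, Entities.HTML32)); // &#352; &amp; &eacute;
        System.out.println(escapeEntities(input));                  // &#352; &#38; &#233;
    }
}
```

With this shape, supporting another target only means adding another Entities constant (or a user-built instance, if the class were public) rather than another escapeHtmlNN method.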