jakarta-taglibs-dev mailing list archives

From Henri Yandell <bay...@generationjava.com>
Subject RE: Proposal For Addition of org.apache.taglibs.string.EscapeAsciiTag
Date Tue, 18 Mar 2003 17:41:31 GMT

I think UTF-8 is fine, as it'll still be 2-byte unicode (UTF-16, as in any
Java String) by the time it hits the escaping code.
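
To illustrate the point, here is a minimal sketch (the class and method names are made up for this example): once UTF-8 bytes are decoded into a Java String, the escaping code only ever sees UTF-16 chars, whatever the encoding was on the wire.

```java
import java.nio.charset.StandardCharsets;

public class DecodeSketch {
    // Hypothetical helper: decode raw UTF-8 bytes into a Java String (UTF-16 internally).
    public static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The copyright sign is two bytes in UTF-8 (0xC2 0xA9) but a single char once decoded.
        String s = decodeUtf8(new byte[] {(byte) 0xC2, (byte) 0xA9});
        System.out.println(s.length() + " char(s), code " + (int) s.charAt(0));
    }
}
```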

I still think you should treat it as two tags, one for XML and one for
HTML. Afaik, XML is unicode, so you only need the 4 basic escapes, whereas
HTML is ascii(?), so you need to escape all the other characters. So
escapeHtml could be implemented as a call to escapeXml, and then lots of
html specific bits.

I'm thinking of:

<str:escapeXml>...</str:escapeXml>
and
<str:escapeHtml useEntities="true">...</str:escapeHtml>

where normally escapeHtml uses &#161; etc, but if useEntities is true, it
tries to fill in &copy; etc.

An alternative is to have:

<str:escape type="xml"> and <str:escape type="html"> which helps keep the
number of tags minimised.

Implemented as:

EscapeXmlTag and EscapeHtmlTag, which both extend StringTagSupport, or as
a part of EscapeTag.
Underneath they would talk to an XmlUtils class which contains
escapeXml(String) and escapeHtml(String, boolean)

Actually, if you look in String Taglib, there is already an XmlW class [I
used to use W for Wrapper, rather than Utils, hidden legacy] which
contains an escapeXml method. It just needs the escapeHtml method.

Brownie-points: Unescape methods :)
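
A rough sketch of what an unescape method might look like, assuming it only needs to reverse the five replacements in XmlW.escapeXml. The class and method names are hypothetical, and it uses plain String.replace rather than StringUtils so that it stands alone:

```java
public class XmlUnescapeSketch {
    // Hypothetical inverse of escapeXml: turns the five escapes back into characters.
    public static String unescapeXml(String str) {
        return str.replace("&lt;", "<")
                  .replace("&gt;", ">")
                  .replace("&quot;", "\"")
                  .replace("&apos;", "'")
                  // &amp; goes last, so "&amp;lt;" comes back as "&lt;" and not as "<"
                  .replace("&amp;", "&");
    }
}
```

The ordering is the mirror image of escapeXml, where & has to be replaced first.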

So, the major decision points seem to be:

1) Separate tags, or wrap them all into the str:escape tag?

2) Anything wrong with XmlW.escapeXml:

    static public String escapeXml(String str) {
        str = StringUtils.replace(str,"&","&amp;");
        str = StringUtils.replace(str,"<","&lt;");
        str = StringUtils.replace(str,">","&gt;");
        str = StringUtils.replace(str,"\"","&quot;");
        str = StringUtils.replace(str,"'","&apos;");
        return str;
    }

 and could you provide an escapeHtml(String, boolean) with the previously
mentioned logic?
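
As a starting point for discussion, a minimal sketch of escapeHtml(String, boolean) might look like the following. The class name and the three-entry entity table are placeholders of my own (a real table would cover the full HTML entity list), and it does everything in one pass rather than calling escapeXml first. It writes the single quote as &#39; because &apos; is not a named entity in HTML 4.

```java
import java.util.HashMap;
import java.util.Map;

public class HtmlEscapeSketch {
    // Illustrative subset of named entities only; not the full HTML list.
    private static final Map<Character, String> NAMED = new HashMap<>();
    static {
        NAMED.put('\u00A1', "iexcl");
        NAMED.put('\u00A9', "copy");
        NAMED.put('\u00AE', "reg");
    }

    public static String escapeHtml(String str, boolean useEntities) {
        StringBuilder out = new StringBuilder(str.length() + 16);
        for (int i = 0; i < str.length(); ) {
            int cp = str.codePointAt(i);       // handles surrogate pairs as one character
            i += Character.charCount(cp);
            switch (cp) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:
                    if (cp < 128) {
                        out.append((char) cp);   // plain ASCII passes through
                    } else {
                        // Try a named entity first when asked to, else fall back to numeric.
                        String name = (useEntities && cp <= 0xFFFF) ? NAMED.get((char) cp) : null;
                        if (name != null) {
                            out.append('&').append(name).append(';');
                        } else {
                            out.append("&#").append(cp).append(';');
                        }
                    }
            }
        }
        return out.toString();
    }
}
```

With useEntities false a copyright sign comes out as &#169;, and with it true as &copy;. Anything missing from the table falls back to the numeric form either way.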

StringTagSupport makes the tags themselves very easy to do, so I'm happy
to handle this, or to commit yours if you code them.

Thoughts?

Hen

On Tue, 18 Mar 2003, Rusty Lowrey wrote:

> Good point Henri, the tag library would probably be more appropriately named
> as EscapeSpecialCharsTag instead.  I had always referred to these characters
> (like &#161;) as ascii equivalents.  Currently, the tag library converts a
> string into UTF-8, then converts to the escape character.  I know it works
> with Latin-1, but am unsure about how other encodings would be handled.  If
> I used UTF-16, would all encodings work?  Or, will UTF-8 be adequate?
>
> Currently, the tag library outputs commonly used characters such as &, <, ",
> or > in the more human-readable format such as &amp;, &lt;, &quot;, or &gt;.
> Any other characters use the numeric version such as &#161;, etc...
>
> Rusty
>
>
>
> -----Original Message-----
> From: Henri Yandell [mailto:bayard@generationjava.com]
> Sent: Monday, March 17, 2003 5:35 PM
> To: Tag Libraries Developers List
> Subject: Re: Proposal For Addition of
> org.apache.taglibs.string.EscapeAsciiTag
>
>
>
> +1 from me, though aren't they called 'character entities' rather than
> 'ascii'?
>
> Other notes:
>
> There was originally an escapeXml tag [with just amp, lt, gt and quot],
> but there was some reason why people didn't think it should be in String.
> I think it was just that some believed that the xml/html bits shouldn't be
> in a String taglib. I think it's a common request/desire though.
>
> [JSTL does have some escape-xml ability, though nothing is explicitly
> available afaik]
>
> So I'm in favour of an escapeXml and an escapeHtml set of tags.
>
> A problem;
>
> You call it escapeAscii, though really it is escapeLatin-1, or
> escape-Lowerbyte-UTF16. Is this something the tag needs to worry about? Or
> should it only work with latin-1 and unicode?
>
> Possibly there could be an argument for the encoding, then the tag could
> convert that to UTF16 [normal Unicode] and then your conversion to escape
> entities specified.
>
> ---
>
> One other thing I'd like to suggest. Do you have an attribute which
> specifies whether character entities or ascii-values are used in the
> escaped *ml?
>
> Hope that's all thought provoking,
>
> Hen
>
> On Mon, 17 Mar 2003, Rusty Lowrey wrote:
>
> > Proposal For Addition of org.apache.taglibs.string.EscapeAsciiTag
> >
> > I) Motivation
> > This tag library converts any string enclosed within its body into proper
> > ascii format for display in a browser. Special characters such as &, <, >,
> > ", ', copyright symbols, registered trademark symbols, special characters
> > for other languages with accents, etc... (full list -
> > http://www.ramsch.org/martin/uni/fmi-hp/iso8859-1.html) do not always
> > display properly within a jsp or html page. These characters must be escaped
> > to the ascii equivalents such as &amp;, &lt;, &gt;, &quot;, &#169;, etc... to
> > display properly.
> >
> > An example might be text containing double-quotes that must be included as a
> > jsp or html attribute.
> >
> > This tag library also allows display of raw html or xml code within a
> > browser.
> >
> > Also, this tag library can be used when a user submits text from a form that
> > may have been copied and pasted from Microsoft Word. Word generates many
> > special characters including smart quotes, middle dots and ellipses that may
> > not display properly in the browser.
> >
> > This tag library could also be used to escape the single quote character
> > within SQL code. I believe that the <sql:escapeSql> tag already does this
> > though.
> >
> > II) Overview
> > This JSP code:
> >
> > <str:escapeAscii>Test "" <&> ©®</str:escapeAscii>
> >
> > would be converted to:
> >
> > Test &quot;&quot; &lt;&amp;&gt; &#169;&#174;
> >
> > which would display properly in the browser as:
> >
> > Test "" <&> ©®
> >
> >
> >
> > Display HTML code such as:
> >
> > <str:escapeAscii><html><head><title>Test</title></head><body>Test</body></html></str:escapeAscii>
> >
> > would be converted to:
> >
> > &lt;html&gt;&lt;head&gt;&lt;title&gt;Test&lt;/title&gt;&lt;/head&gt;&lt;body&gt;Test&lt;/body&gt;&lt;/html&gt;
> >
> > which would display properly in the browser as:
> >
> > <html><head><title>Test</title></head><body>Test</body></html>
> >
> >
> >
> > Also display XML code in the same manner.
> >
> > III) Requirements
> > JSP 1.1 compatible - no outside dependencies
> >
> > IV) Commitment
> > I am willing to assume the role of committer. The code is already written
> > and has been tested locally.
> >
> > Thanks,
> >
> > Rusty Lowrey
> > rustylowrey@earthlink.net
> >
> >
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: taglibs-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: taglibs-dev-help@jakarta.apache.org

