tomcat-dev mailing list archives

From Christoph Seibert <seib...@cs.uni-bonn.de>
Subject Re: cvs commit: jakarta-tomcat-4.0/catalina/src/share/org/apache/catalina/core StandardServer.java
Date Fri, 03 Jan 2003 12:03:02 GMT
Hi there,

I think there is a problem with the following fix:

> amyroh      2003/01/02 17:59:09
>
>   Modified:    catalina/src/share/org/apache/catalina/core
>                         StandardServer.java
>   Log:
>   Fix for bugzilla 15762.
[...]
>   diff -u -r1.32 -r1.33
>   --- StandardServer.java	11 Sep 2002 14:19:33 -0000	1.32
>   +++ StandardServer.java	3 Jan 2003 01:59:08 -0000	1.33
>   @@ -824,7 +824,15 @@
>                } else if (c == '"') {
>                    filtered.append("&quot;");
>                } else if (c == '&') {
>   -                filtered.append("&amp;");
>   +                char s1 = input.charAt(i+3);
>   +                char s2 = input.charAt(i+4);
>   +                char s3 = input.charAt(i+5);
>   +                if (((s1 == ';') || (s2 == ';')) || (s3 == ';')) {
>   +                    // do not convert if it's already in converted form
>   +                    filtered.append(c);
>   +                } else {
>   +                    filtered.append("&amp;");
>   +                }
>                } else {
>                    filtered.append(c);
>                }

(Note: I haven't had a look at the surrounding code yet, so I have to
assume that 'i' is the position of 'c', that is the '&' character.)

This code assumes that a character or entity reference is never
shorter than 4 characters (including the delimiters '&' and ';')
and never longer than 6. However, the XML specification does not
define any such restriction. For example, '&d;' is a valid entity
reference (assuming it is declared in the DTD). Worse, character
and entity references can have arbitrary length. For example,
'&#x0000000000020;' is a valid character reference to the ' '
(space) character.
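
To make the cases above concrete, here is a minimal sketch that mirrors
the condition from the quoted diff. It assumes, as noted above, that 'i'
is the index of the '&'. The class and method names (FixedOffsetCheck,
looksConverted) are made up for illustration and are not part of
StandardServer.java.

```java
final class FixedOffsetCheck {

    // Same test as the quoted revision 1.33: look for ';' at i+3, i+4 or i+5.
    public static boolean looksConverted(String input, int i) {
        char s1 = input.charAt(i + 3);
        char s2 = input.charAt(i + 4);
        char s3 = input.charAt(i + 5);
        return (s1 == ';') || (s2 == ';') || (s3 == ';');
    }

    public static void main(String[] args) {
        // Short entity reference: the ';' sits at i+2, so it is missed.
        System.out.println(looksConverted("&d; and more", 0));       // false
        // Long character reference: the ';' sits far beyond i+5.
        System.out.println(looksConverted("&#x0000000000020;", 0));  // false
        // A bare '&' followed by an unrelated ';' is taken for a reference.
        System.out.println(looksConverted("a & b; c", 2));           // true
        // Fewer than five characters after the '&': charAt(i + 5) throws.
        try {
            looksConverted("&amp;", 0);
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("out of bounds for \"&amp;\" at end of input");
        }
    }
}
```

The last case depends on the surrounding code, which I have not seen, so
it may be guarded against there.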

I'm sorry I don't have a better fix right now, but I assume one
would have to iterate through the characters following the '&'
until either a ';' is found or a character occurs that is not a legal
part of an entity reference name (or in the case of a character
reference, not one of [0-9] for decimal or [0-9a-fA-F] for
hexadecimal).
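
A rough sketch of that iteration, not a tested fix for StandardServer.java:
the names ReferenceScanner, isReferenceAt and escapeAmpersands are
hypothetical, and the name-character test is a deliberate simplification
of the XML Name production rather than a faithful implementation of it.

```java
final class ReferenceScanner {

    /** True if a complete reference ("&name;", "&#123;", "&#x1F;") starts at index amp. */
    public static boolean isReferenceAt(String input, int amp) {
        int n = input.length();
        int i = amp + 1;
        if (i >= n) {
            return false;                 // '&' is the last character
        }
        if (input.charAt(i) == '#') {     // character reference
            i++;
            boolean hex = i < n && input.charAt(i) == 'x';
            if (hex) {
                i++;
            }
            int digitsStart = i;
            while (i < n && isDigit(input.charAt(i), hex)) {
                i++;
            }
            // Need at least one digit, followed by the closing ';'.
            return i > digitsStart && i < n && input.charAt(i) == ';';
        }
        if (!isNameStart(input.charAt(i))) {
            return false;                 // e.g. "& " or "&;"
        }
        i++;
        while (i < n && isNameChar(input.charAt(i))) {
            i++;
        }
        return i < n && input.charAt(i) == ';';
    }

    private static boolean isDigit(char c, boolean hex) {
        if (c >= '0' && c <= '9') {
            return true;
        }
        return hex && ((c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F'));
    }

    // Assumption: a simplified stand-in for the XML Name production.
    private static boolean isNameStart(char c) {
        return Character.isLetter(c) || c == '_' || c == ':';
    }

    private static boolean isNameChar(char c) {
        return isNameStart(c) || Character.isDigit(c) || c == '.' || c == '-';
    }

    /** Escapes only those '&' characters that do not begin a reference. */
    public static String escapeAmpersands(String input) {
        StringBuilder filtered = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '&' && !isReferenceAt(input, i)) {
                filtered.append("&amp;");
            } else {
                filtered.append(c);
            }
        }
        return filtered.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeAmpersands("a & b; &d; &#x0000000000020; &amp;"));
    }
}
```

Every index is checked against the input length before it is read, so an
'&' at or near the end of the input is simply escaped. This only checks
that a reference is syntactically complete. It cannot tell whether an
entity such as '&d;' is actually declared.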

(Actually, I believe this wheel must already have been invented,
but from this code snippet alone I can't tell.)

Ciao,
Christoph

-- 
--- Christoph Seibert                   seibert@cs.uni-bonn.de ---
-- Farlon Dragon -==(UDIC)==-    http://home.pages.de/~seibert/ --
- Who can possibly rule if no one                                -
-         who wants to can be allowed to?     - D. Adams, HHGTTG -

