tomcat-dev mailing list archives

From Amy Roh <amy...@apache.org>
Subject Re: cvs commit: jakarta-tomcat-4.0/catalina/src/share/org/apache/catalina/core StandardServer.java
Date Fri, 03 Jan 2003 19:55:21 GMT
Christoph Seibert wrote:
> Hi there,
> 
> I think there is a problem with the following fix:
> 
>> amyroh      2003/01/02 17:59:09
>>
>>   Modified:    catalina/src/share/org/apache/catalina/core
>>                         StandardServer.java
>>   Log:
>>   Fix for bugzilla 15762.
> 
> [...]
> 
>>   diff -u -r1.32 -r1.33
>>   --- StandardServer.java    11 Sep 2002 14:19:33 -0000    1.32
>>   +++ StandardServer.java    3 Jan 2003 01:59:08 -0000    1.33
>>   @@ -824,7 +824,15 @@
>>                } else if (c == '"') {
>>                    filtered.append("&quot;");
>>                } else if (c == '&') {
>>   -                filtered.append("&amp;");
>>   +                char s1 = input.charAt(i+3);
>>   +                char s2 = input.charAt(i+4);
>>   +                char s3 = input.charAt(i+5);
>>   +                if (((s1 == ';') || (s2 == ';')) || (s3 == ';')) {
>>   +                    // do not convert if it's already in converted form
>>   +                    filtered.append(c);
>>   +                } else {
>>   +                    filtered.append("&amp;");
>>   +                }
>>                } else {
>>                    filtered.append(c);
>>                }
> 
> 
> (Note: I haven't had a look at the surrounding code yet, so I have to
> assume that 'i' is the position of 'c', that is the '&' character.)
> 
> This code assumes that character or entity references will not be
> shorter than 4 characters (including the delimiters '&' and ';')
> and no longer than 6. However, the XML specification does not in
> any way define restrictions like that. For example, '&d;' is a
> valid entity reference (assuming it was defined in the DTD). Worse,
> character or entity references can have arbitrary length. For example,
> '&#x0000000000020' is a valid character reference to the ' ' (space)
> character.
> 
> I'm sorry I don't have a better fix right now, but I assume one
> would have to iterate through the characters following the '&'
> until either a ';' is found or a character occurs that is not a legal
> part of an entity reference name (or in the case of a character
> reference, not one of [0-9] for decimal or [0-9a-fA-F] for
> hexadecimal).
> 
> (Actually, I believe this wheel must already have been invented,
> but with only looking at this code snippet, I don't really know.)

I believe iterating through the characters following the '&' until a
';' is found will fix the problem.  A reference such as
'&#x0000000000020' without a trailing ';' will result in a parsing error,
whereas '&#x0000000000020;' will be written as a space (' ').
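
A minimal sketch of that scan, not the code that was committed. The
class name EntityAwareFilter and the helpers referenceEnd, isNameStart,
isNameChar and isDigit are hypothetical, the '<' and '>' branches are
assumed from the usual shape of such a filter, and the name check only
approximates the XML Name production. It also avoids reading past the
end of the input, which the fixed offsets i+3..i+5 would do when '&'
is among the last few characters (assuming 'i' is the position of '&').

```java
// Hypothetical helper, not part of StandardServer.java.
final class EntityAwareFilter {

    // Escapes markup characters but leaves complete references such as
    // "&d;" or "&#x20;" untouched, whatever their length.
    static String filter(String input) {
        StringBuilder filtered = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '<') {                 // assumed branch, not shown in the diff
                filtered.append("&lt;");
            } else if (c == '>') {          // assumed branch, not shown in the diff
                filtered.append("&gt;");
            } else if (c == '"') {
                filtered.append("&quot;");
            } else if (c == '&') {
                int end = referenceEnd(input, i);
                if (end < 0) {
                    // no well-formed reference follows: escape the '&'
                    filtered.append("&amp;");
                } else {
                    // copy the whole reference, including the ';'
                    filtered.append(input, i, end + 1);
                    i = end;
                }
            } else {
                filtered.append(c);
            }
        }
        return filtered.toString();
    }

    // Returns the index of the ';' closing a reference that starts at
    // position amp, or -1 if the text there is not a complete reference.
    static int referenceEnd(String input, int amp) {
        int n = input.length();
        int i = amp + 1;
        if (i >= n) {
            return -1;
        }
        if (input.charAt(i) == '#') {
            // character reference: &#[0-9]+; or &#x[0-9a-fA-F]+;
            i++;
            boolean hex = i < n && input.charAt(i) == 'x';
            if (hex) {
                i++;
            }
            int start = i;
            while (i < n && isDigit(input.charAt(i), hex)) {
                i++;
            }
            if (i == start) {
                return -1;
            }
        } else {
            // entity reference: &Name; (approximate name check)
            if (!isNameStart(input.charAt(i))) {
                return -1;
            }
            i++;
            while (i < n && isNameChar(input.charAt(i))) {
                i++;
            }
        }
        return (i < n && input.charAt(i) == ';') ? i : -1;
    }

    private static boolean isDigit(char c, boolean hex) {
        return (c >= '0' && c <= '9')
            || (hex && ((c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')));
    }

    private static boolean isNameStart(char c) {
        return Character.isLetter(c) || c == '_' || c == ':';
    }

    private static boolean isNameChar(char c) {
        return Character.isLetterOrDigit(c)
            || c == '_' || c == ':' || c == '-' || c == '.';
    }
}
```

In this sketch an unterminated reference gets its '&' escaped instead
of being passed through to fail later in the parser; which of the two
is preferable is a choice this thread leaves open.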

Thanks,
Amy

> 
> Ciao,
> Christoph
> 





