tomcat-dev mailing list archives

From "David Weinrich" <dwei...@home.com>
Subject Re: [PATCH]: Bugrat report 723 ( unescaping/unencoding URLs ) was 'Re: 3.2.2 Release?'
Date Fri, 09 Feb 2001 21:13:47 GMT
Just checking back on the status of this patch. I received some email from
someone who isn't actively on the list but has the same problem as
originally reported in Bugrat report 723. Does anyone have any feedback on
the patch, or on the question I had about control characters in URLs?

David Weinrich

----- Original Message -----
From: "David Weinrich" <dweinr1@home.com>
To: "Tomcat Dev List" <tomcat-dev@jakarta.apache.org>
Sent: Friday, February 02, 2001 23:18
Subject: [PATCH]: Bugrat report 723 ( unescaping/unencoding URLs ) was 'Re:
3.2.2 Release?'

  Thanks to everyone, that cleared things up quite a bit. Here is the patch
for bugrat report 723 ( tomcat 3.2.x not unescaping escaped urls ). The
patch is extremely short, and is implemented a little differently from the
one I sent in for 3.2.x a while ago: the unencoding is now done before the
path is checked for other issues/security concerns, so that escaped
sequences cannot slip past those checks and cause problems after the fact.
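
To illustrate why the order matters, here is a minimal sketch. It is not the
code from the patch: the class name, the looksUnsafe check, and the tiny
decoder are hypothetical stand-ins, and the decoder assumes well-formed
input. A check that looks for ".." finds nothing in the raw path but does
find it once the path has been unencoded.

```java
class DecodeThenCheck {
    // Hypothetical stand-in for the real path checks: flags any "..".
    static boolean looksUnsafe(String path) {
        return path.contains("..");
    }

    // Minimal decoder for illustration only; assumes every '%' is
    // followed by two hex digits (no error handling).
    static String decode(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                sb.append((char) Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String raw = "/%2e%2e/WEB-INF/web.xml";
        // Checking the raw path misses the escaped dots.
        System.out.println("raw flagged:     " + looksUnsafe(raw));
        // Checking after decoding sees them.
        System.out.println("decoded flagged: " + looksUnsafe(decode(raw)));
    }
}
```
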
  Note: I haven't yet mastered the art of Watchdog/internal Tomcat tests, so
this will need to be tested a bit more thoroughly. So far the following URLs
work correctly:

http://localhost:8080/index%20%23%24.jsp
http://localhost:8080/index%20%23%24.html

corresponding to the following filenames in the ROOT webapp dir:

'index #$.jsp' and
'index #$.html'

If an error occurs during unencoding, null is returned, which ends up
sending a Not Found (404) response instead of a stack trace. This seemed to
be the sanest way to handle improperly encoded URLs. Such errors are usually
the result of a % that is not followed by two hex digits, or of an unencoded
% in the URL, as in:

http://localhost:8080/index%%20%23%24.jsp or
http://localhost:8080/index%zz%23%24.jsp
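
The behaviour described above can be sketched roughly as follows. This is an
illustration, not the patch itself: the class and method names are
hypothetical, and each decoded byte is simply treated as a single character
(an assumption; the real RequestUtil may handle this differently).

```java
class StrictPercentDecoder {
    // Returns the value of a hex digit, or -1 if c is not one.
    static int hexValue(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    // Decodes %XX escapes; returns null if any '%' is not followed by
    // exactly two hex digits, so the caller can answer with a 404.
    static String decode(String path) {
        StringBuilder out = new StringBuilder(path.length());
        int i = 0;
        while (i < path.length()) {
            char c = path.charAt(i);
            if (c != '%') {
                out.append(c);
                i++;
                continue;
            }
            if (i + 2 >= path.length()) {
                return null; // truncated escape at the end of the path
            }
            int hi = hexValue(path.charAt(i + 1));
            int lo = hexValue(path.charAt(i + 2));
            if (hi < 0 || lo < 0) {
                return null; // e.g. "%%2" or "%zz"
            }
            // Assumption: one decoded byte maps to one character.
            out.append((char) ((hi << 4) | lo));
            i += 3;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("/index%20%23%24.jsp"));
        System.out.println(decode("/index%%20%23%24.jsp"));
        System.out.println(decode("/index%zz%23%24.jsp"));
    }
}
```

With this sketch, the two working URLs earlier in this message decode to the
filenames listed, and both malformed examples yield null.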

One last remaining concern I have: the current implementation of RequestUtil
allows control characters to pass through without raising an exception. I
assume this could cause problems and is undesirable. If I interpret
http://www.ietf.org/rfc/rfc2396.txt correctly, control characters should not
be included in URLs. If it is agreeable, I will make a patch to
RequestUtil.URLDecode tomorrow to block characters in the ranges 00-1f and
7f-9f to prevent this from being a potential problem. Thanks again!
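
A rough sketch of the kind of check proposed here, applied to an already
decoded path. The class and method names are hypothetical, and how
RequestUtil.URLDecode would actually report the problem (null, exception) is
left open. The standard Character.isISOControl method covers the same two
ranges, so it could be used instead of the explicit comparison.

```java
class ControlCharCheck {
    // True for characters in the ranges 0x00-0x1f and 0x7f-0x9f.
    static boolean isBlocked(char c) {
        return c <= 0x1f || (c >= 0x7f && c <= 0x9f);
    }

    // True if the decoded path contains any blocked character.
    static boolean containsControl(String decodedPath) {
        for (int i = 0; i < decodedPath.length(); i++) {
            if (isBlocked(decodedPath.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // "%00" decodes to character 0x00, which would be rejected.
        String withNul = "/index" + (char) 0x00 + ".jsp";
        System.out.println(containsControl("/index #$.jsp"));
        System.out.println(containsControl(withNul));
    }
}
```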


David

