tomcat-users mailing list archives

From "Yansheng Lin" <>
Subject RE: URL encoding/decoding bug in form-based security?
Date Fri, 06 Feb 2004 22:15:50 GMT
Hey, check out the following section on URI/URL Standard Specification(RFC

2.4.3. Excluded US-ASCII Characters

   Although they are disallowed within the URI syntax, we include here a
   description of those US-ASCII characters that have been excluded and
   the reasons for their exclusion.

   The control characters in the US-ASCII coded character set are not
   used within a URI, both because they are non-printable and because
   they are likely to be misinterpreted by some control mechanisms.

   control     = <US-ASCII coded characters 00-1F and 7F hexadecimal>

   The space character is excluded because significant spaces may
   disappear and insignificant spaces may be introduced when URI are
   transcribed or typeset or subjected to the treatment of word-
   processing programs.  Whitespace is also used to delimit URI in many
   contexts.

   space       = <US-ASCII coded character 20 hexadecimal>

   The angle-bracket "<" and ">" and double-quote (") characters are
   excluded because they are often used as the delimiters around URI in
   text documents and protocol fields.  The character "#" is excluded
   because it is used to delimit a URI from a fragment identifier in URI
   references (Section 4). The percent character "%" is excluded because
   it is used for the encoding of escaped characters.

   delims      = "<" | ">" | "#" | "%" | <">
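
The three character classes quoted above can be written down as plain sets. This is a minimal sketch in Python, assuming only the control, space and delims definitions shown in the excerpt; the helper names find_excluded and escape_excluded are made up for illustration and are not part of any library.

```python
# Excluded US-ASCII characters, taken from the quoted definitions:
#   control = 00-1F and 7F hexadecimal
#   space   = 20 hexadecimal
#   delims  = "<" | ">" | "#" | "%" | <">
CONTROL = {chr(c) for c in range(0x00, 0x20)} | {chr(0x7F)}
SPACE = {" "}
DELIMS = set('<>#%"')
EXCLUDED = CONTROL | SPACE | DELIMS


def find_excluded(text):
    """Return the excluded characters in text, in order of first appearance."""
    seen = []
    for ch in text:
        if ch in EXCLUDED and ch not in seen:
            seen.append(ch)
    return seen


def escape_excluded(text):
    """Percent-encode only the excluded characters (hypothetical helper)."""
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in EXCLUDED else ch for ch in text
    )


print(find_excluded("turtle#2.jpg"))
print(escape_excluded("turtle#2.jpg"))
```

Because "%" is itself in the delims set, running escape_excluded on a name that is already escaped encodes it a second time, so it should only be applied to raw names.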

-----Original Message-----
From: Bill Haake [] 
Sent: Friday, February 06, 2004 2:13 PM
To: Tomcat Users List
Subject: URL encoding/decoding bug in form-based security?

I have been working on tracking down a problem with special characters in
URLs that shows up when using form-based authentication in a security
constraint. I have just about reached the limit of my ability to find the
problem and am hoping that someone more familiar with the details of
authentication can nail it down.

My setups (same problem in each)

windows 2000
IIS 5.0
isapi_redirector2.dll binary from apache
j2sdk 1.4.2_03
tomcat 5.0.18


redhat 9 linux
apache 2.0.40
j2sdk 1.4.2_02
tomcat 5.0.16

The problem occurs with files whose names contain special characters that
require encoding. I discovered it with a file that has a '#' in the name,
for example turtle#2.jpg, which is encoded as turtle%232.jpg.
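
The encoding step described here can be reproduced with Python's standard library. This is only a sketch of the round trip for the file name from the message; it does not show what Tomcat or the connector do internally.

```python
from urllib.parse import quote, unquote

name = "turtle#2.jpg"

# quote() percent-encodes '#' because it is not a safe path character.
encoded = quote(name)

# unquote() reverses the escaping and restores the literal '#'.
decoded = unquote(encoded)

print(encoded)
print(decoded)
```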

I have set up several files to show the problem on my linux server.
Using the redirector:

no special characters, no security
encoded '#', no security
no special characters, secured with form-based security (user: test, pw: test)
encoded '#', security

Close your browser before trying the last one, so that the login form is
displayed. After you enter the user and password (you have to enter them
twice for some reason), it tries to load turtle#2.jpg, which fails because
'#' is the special character for an anchor. It thinks the file is "turtle"
with an anchor of "2.jpg".
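
The symptom can be illustrated with a short Python sketch. It assumes, without confirmation from the thread, that the saved request URL is decoded once and then reused for the redirect without being encoded again. The function saved_request_redirect and the /secure/ path are hypothetical names chosen for the example.

```python
from urllib.parse import quote, unquote, urlsplit


def saved_request_redirect(requested_path, reencode):
    """Hypothetical model of replaying a saved request after login."""
    decoded = unquote(requested_path)  # '%23' becomes a literal '#'
    return quote(decoded) if reencode else decoded


# Redirect target built from the decoded path: '#' now starts a fragment.
bad = urlsplit(saved_request_redirect("/secure/turtle%232.jpg", reencode=False))

# Redirect target that is encoded again: the whole name stays in the path.
good = urlsplit(saved_request_redirect("/secure/turtle%232.jpg", reencode=True))

print(bad.path, bad.fragment)
print(good.path, good.fragment)
```

In the first case the client would request "turtle" and keep "2.jpg" as an anchor, which matches the behaviour described above. Whether the redirector or the form authenticator is the component that drops the encoding is not established here.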

If you go directly to tomcat, they all work.

The failure only occurs when the file containing the special character is
the first thing loaded from the protected site, so exit the browser or
otherwise invalidate the session to get it to occur. I haven't tested with
other characters to see if they cause problems. My security settings were
copied from the ones for jsp-examples/security that come with 5.0.18.

