tomcat-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From bugzi...@apache.org
Subject DO NOT REPLY [Bug 29900] - request params in utf-8 corrupted
Date Sat, 03 Jul 2004 15:28:02 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=29900>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=29900

request params in utf-8 corrupted

markt@apache.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID



------- Additional Comments From markt@apache.org  2004-07-03 15:28 -------
TC5 no longer defaults to using the body encoding for parameters as this is 
not spec compliant. See my standard text on encoding (attached below) for more 
info.


REQUESTS
========

There are a number of situations where there may be a requirement to use non-
US ASCII characters in a URI. These include:
- Parameters in the query string
- Servlet paths

There is a standard for encoding URIs (http://www.w3.org/International/O-URL-
code.html) but this standard is not consistently followed by clients. This 
causes a number of problems.

The functionality provided by Tomcat (4 and 5) to handle this less than ideal 
situation is described below.

1. The Coyote HTTP/1.1 connector has a useBodyEncodingForURI attribute which 
if set to true will use the request body encoding to decode the URI query 
parameters.
  - The default value is true for TC4 (breaks spec but gives consistent 
behaviour across TC4 versions)
  - The default value is false for TC5 (spec compliant but there may be 
migration issues for some apps)
2. The Coyote HTTP/1.1 connector has a URIEncoding attribute which defaults to 
ISO-8859-1.
3. The parameters class (o.a.t.u.http.Parameters) has a QueryStringEncoding 
field which defaults to the URIEncoding. It must be set before the parameters 
are parsed to have an effect.

Things to note regarding the servlet API:
1. HttpServletRequest.setCharacterEncoding() normally only applies to the 
request body NOT the URI.
2. HttpServletRequest.getPathInfo() is decoded by the web container.
3. HttpServletRequest.getRequestURI() is not decoded by container.

Other tips:
1. Use POST with forms to return parameters as the parameters are then part of 
the request body.


RESPONSES
=========

HTML META
 tags are ignored by Tomcat. You may use <%@ page pagEncoding="..." %> for 
JSPs.

---------------------------------------------------------------------
To unsubscribe, e-mail: tomcat-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: tomcat-dev-help@jakarta.apache.org


Mime
View raw message