tomcat-users mailing list archives

From "Mike Spreitzer"
Subject Shouldn't Tomcat 3.2.1 decode the UTF-8 encoding of request parameters?
Date Mon, 19 Feb 2001 22:27:27 GMT
Consider a form that is encoded in UTF-8.  Here's how it comes down:

HTTP/1.0 200 OK
Content-Type: text/html; charset=UTF-8
Servlet-Engine: Tomcat Web Server/3.2.1 (JSP 1.1; Servlet 2.2; Java 1.3.0; 
AIX 4.3 ppc; java.vendor=IBM Corporation)

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
<FORM METHOD=POST ACTION="/servlet/SusrReg">
<INPUT NAME="usr" TYPE=text SIZE="20">

I fill in the "usr" field with a single character, U+201D, and submit. 
Here's how the submission goes up:

POST /servlet/SusrReg HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, 
application/x-comet, application/pdf, */*
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
Content-Length: 165
Connection: Keep-Alive
Cookie: JSESSIONID=loj2w5hcz1


In my servlet, the value of the request parameter named "usr" is a 
string of three characters: U+00E2, U+0080, U+009D.  Those are the three 
bytes of the UTF-8 encoding of U+201D (E2 80 9D), each turned into a 
character of its own.  Should I be offended, or should I expect that the 
servlet has to decode the UTF-8 itself?  I find the servlet spec v2.2 
fairly silent on the issue, which leads me to expect that the servlet 
container is supposed to handle the full parameter decoding.
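
For reference, here is a minimal sketch of what decoding in the servlet would look like. It assumes the container turned each byte of the form data into one character, as ISO-8859-1 decoding would, which matches the three characters reported above. The class name ParamRecode and the method recodeAsUtf8 are made up for illustration and are not part of Tomcat or the servlet API.

```java
import java.nio.charset.StandardCharsets;

public class ParamRecode {
    /**
     * Re-reads a string whose characters are really UTF-8 bytes,
     * one byte per character (hypothetical helper, not a servlet API).
     */
    public static String recodeAsUtf8(String misdecoded) {
        // ISO-8859-1 maps U+0000..U+00FF to the same byte values,
        // so this recovers the original bytes of the submission.
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        // Decode those bytes with the charset the form was actually sent in.
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The value as observed in the servlet: U+00E2, U+0080, U+009D.
        String seen = "\u00E2\u0080\u009D";
        String fixed = recodeAsUtf8(seen);
        // Prints: 1 char(s), first is U+201D
        System.out.println(fixed.length() + " char(s), first is U+"
                + Integer.toHexString(fixed.charAt(0)).toUpperCase());
    }
}
```

In a servlet this would be applied to the result of request.getParameter("usr"). It is only safe when the value really was decoded one byte per character; a string that already contains characters above U+00FF would be damaged by the round trip.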

