tomcat-users mailing list archives

From "Mike Spreitzer" <mspre...@us.ibm.com>
Subject Shouldn't Tomcat 3.2.1 decode the UTF-8 encoding of request parameters?
Date Mon, 19 Feb 2001 22:27:27 GMT
Consider a form that is encoded in UTF-8.  Here's how it comes down:

HTTP/1.0 200 OK
Content-Type: text/html; charset=UTF-8
Servlet-Engine: Tomcat Web Server/3.2.1 (JSP 1.1; Servlet 2.2; Java 1.3.0; AIX 4.3 ppc; java.vendor=IBM Corporation)


<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
   "http://www.w3.org/TR/html4/DTD/loose.dtd">
<html>
...
<FORM METHOD=POST ACTION="/servlet/SusrReg">
...
<INPUT NAME="usr" TYPE=text SIZE="20">
...

I fill in the "usr" field with a single character, U+201D, and submit. 
Here's how the submission goes up:

POST /servlet/SusrReg HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-comet, application/pdf, */*
Referer: http://9.2.43.70:8085/servlet/SusrReg
Accept-Language: en-us
Content-Type: application/x-www-form-urlencoded
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)
Host: 9.2.43.70:8085
Content-Length: 165
Connection: Keep-Alive
Cookie: JSESSIONID=loj2w5hcz1

usr=%E2%80%9D&B1=Submit

In my servlet, the value of the request parameter named "usr" is a 
string of three characters: U+00E2, U+0080, U+009D.  In other words, each 
of the three UTF-8 bytes has become its own character instead of the 
sequence being decoded to U+201D.  Is this a bug, or is the servlet itself 
expected to decode the UTF-8?  The Servlet 2.2 spec is fairly silent on 
the issue, which leads me to expect that the servlet container is 
supposed to handle the full parameter decoding.
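
For reference, here is a minimal sketch of the workaround a servlet could apply if the container does not do this. It assumes the container turned each percent-decoded byte into one character, as ISO-8859-1 decoding would, which matches the three characters observed above. The class and method names are hypothetical, and StandardCharsets needs a newer JVM than the Java 1.3.0 shown in the response header; on an older JVM the charset-name overloads getBytes("ISO-8859-1") and new String(bytes, "UTF-8") play the same role.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tomcat or the servlet API.
public class Utf8ParamFix {

    // Assumes the input holds one char per original byte (ISO-8859-1 style).
    // Recovers the raw bytes, then decodes them as UTF-8.
    public static String redecode(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The three characters the servlet saw for "usr".
        String seen = "\u00E2\u0080\u009D";
        String fixed = redecode(seen);
        System.out.println(fixed.length() + " char(s), first is U+"
                + Integer.toHexString(fixed.charAt(0)).toUpperCase());
    }
}
```

In a servlet this would be applied to the result of request.getParameter("usr"). It is only safe when the form is known to have been submitted as UTF-8, since the request above carries no charset in its Content-Type.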

Thanks,
Mike

