tomcat-users mailing list archives

From André Warnier
Subject Re: UTF-8 handling differs between two servlets within the same application
Date Mon, 23 Jun 2008 22:36:56 GMT

Mark Thomas wrote:
> I tend to use the following as a starting point to check my config is 
> OK. It is also useful to compare headers etc for your application 
> against the headers from this simple test case.
This is a bit outside the scope of this thread, but as someone who is 
confronted with this kind of character set issue on the web all the 
time, I feel I have to say that the comment at the beginning of that 
example can be misleading, and in my view should be taken out.

It tends to lead people into doing things they should not do, and 
which will always come back to bite them in the end.
(For the same reason, I believe that all the methods or parameters 
dealing with "URI encoding" should be banned).

I could make a long case, but the summary is: don't use GET with forms 
if you want to have any luck with applications that may have to handle 
input characters other than US-ASCII (as all web applications will have 
to, sooner or later; think of smileys).
The situation is already confusing enough with POSTed forms, without 
adding extra sources of problems.
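
To illustrate the kind of problem I mean, here is a minimal sketch in 
plain Java (no servlet container involved). It assumes the browser 
percent-encodes the form value as UTF-8 and that the receiving side 
decodes the query string with either UTF-8 or ISO-8859-1. The class and 
method names are hypothetical, and this is not what Tomcat does 
internally; it only shows that the bytes in a GET URL carry no 
indication of which charset produced them.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class GetEncodingDemo {

    // Percent-encode a form value the way a client would, for a given charset.
    // (Charset overloads of URLEncoder/URLDecoder require Java 10 or later.)
    static String encodeAs(String value, Charset charset) {
        return URLEncoder.encode(value, charset);
    }

    // Decode a percent-encoded value with the charset the receiver assumes.
    static String decodeAs(String encoded, Charset charset) {
        return URLDecoder.decode(encoded, charset);
    }

    public static void main(String[] args) {
        String typed = "caf\u00e9"; // "cafe" with an acute accent on the e

        // Assumption: the client encodes the value as UTF-8.
        String sent = encodeAs(typed, StandardCharsets.UTF_8);
        System.out.println("sent in URL: " + sent);

        // A receiver assuming UTF-8 recovers the original value.
        String ok = decodeAs(sent, StandardCharsets.UTF_8);
        System.out.println("UTF-8 round trip intact: " + ok.equals(typed));

        // A receiver assuming ISO-8859-1 turns the two UTF-8 bytes
        // into two separate characters, so the value is corrupted.
        String wrong = decodeAs(sent, StandardCharsets.ISO_8859_1);
        System.out.println("ISO-8859-1 round trip intact: " + wrong.equals(typed));
        System.out.println("lengths: " + typed.length() + " vs " + wrong.length());
    }
}
```

Nothing in the encoded URL tells the receiver which of the two 
decodings is the right one, so both ends have to agree on the charset 
by some other means.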

The HTML 4.01 spec (and, I suspect, the XHTML one also) mentions this as 
follows, in the same specification, same section:

Note. The "get" method restricts form data set values to ASCII 
characters. Only the "post" method (with enctype="multipart/form-data") 
is specified to cover the entire [ISO10646] character set.

(17.13.4 Form content types)
Also see RFC 3986.

