tomcat-users mailing list archives

From Endre Stølsvik <En...@Stolsvik.com>
Subject Re: RequestDumperValve screws UTF-8 parameter parsing
Date Tue, 10 Jan 2006 09:16:34 GMT
On Tue, 10 Jan 2006, Oded Arbel wrote:

| On Tuesday, 10 January 2006 00:06, Endre Stølsvik wrote:
| > Enabling the RequestDumperValve in both 5.5.12 and 5.0.16 (!) messes
| > up the parsing of other-than-ISO-8859-1 incoming parameters.
| >
| > After spending a rather huge number of hours on it, this is the
| > result: when this "debug valve" is turned on, it seems to default to
| > ISO-8859-1 when it parses and logs the incoming parameters, thus
| > also implicitly setting the entire Request object to this encoding,
| > so any subsequent setting to UTF-8 doesn't matter at all. At least
| > this is true for POST parameters.
| 
| AFAIK, the Catalina implementation of HttpServletRequest does not allow 
| setting the character set more than once, even though it doesn't do any 
| pre-processing of the input.
| 
| Maybe that should be fixed instead ?

I think that as soon as you "touch" the servlet request object's 
parameters at all (or, it seems, even some other parts of the request), it 
parses them all at once using the encoding set at that point (or the 
default) and caches them.
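
The parse-once-and-cache behaviour described above can be sketched with a 
small stand-alone class. This is only an illustration under assumptions: 
LazyParamRequest is a hypothetical stand-in, not Tomcat's actual request 
implementation, and it assumes Java 10 or later for 
URLDecoder.decode(String, Charset).

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a servlet request; NOT Tomcat's real class.
class LazyParamRequest {
    private final String rawBody;  // percent-encoded form body as sent by the browser
    private Charset encoding = StandardCharsets.ISO_8859_1;  // default if nothing is set
    private Map<String, String> cached;  // filled on the first parameter access

    LazyParamRequest(String rawBody) {
        this.rawBody = rawBody;
    }

    void setCharacterEncoding(Charset cs) {
        // The field changes, but already-parsed parameters are not re-decoded.
        encoding = cs;
    }

    String getParameter(String name) {
        if (cached == null) {
            // First touch: decode ALL parameters with whatever encoding is set now.
            cached = new HashMap<>();
            for (String pair : rawBody.split("&")) {
                String[] kv = pair.split("=", 2);
                String key = URLDecoder.decode(kv[0], encoding);
                String value = kv.length > 1 ? URLDecoder.decode(kv[1], encoding) : "";
                cached.put(key, value);
            }
        }
        return cached.get(name);
    }

    public static void main(String[] args) {
        // A name containing U+00F8, percent-encoded as UTF-8 (bytes C3 B8).
        String body = "name=St%C3%B8lsvik";

        // Encoding set before the first parameter access: decoded as UTF-8.
        LazyParamRequest early = new LazyParamRequest(body);
        early.setCharacterEncoding(StandardCharsets.UTF_8);
        System.out.println(early.getParameter("name"));

        // Something (e.g. a debug valve) reads a parameter first, so the
        // ISO-8859-1 default is used and the later UTF-8 setting has no effect.
        LazyParamRequest late = new LazyParamRequest(body);
        late.getParameter("name");
        late.setCharacterEncoding(StandardCharsets.UTF_8);
        System.out.println(late.getParameter("name"));
    }
}
```

In this sketch the second request keeps the mis-decoded value because the 
cache is never invalidated, which mirrors what the valve appears to cause: 
whoever reads the parameters first decides the encoding for everyone else.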

This is most probably according to spec.

I really don't find this that problematic, as it also ensures that you 
code in the most efficient way: letting you set the encoding (differently) 
multiple times would definitely make processing slower, and would also 
most probably simply be a bug: the browser sends all its parameters using 
one encoding, and it would be a strange situation if you really needed to 
change the encoding midway through your processing.

I feel the problem here is a) i18n ignorance, b) bad coding logic, and c) 
bad documentation/commenting of a debugging feature.

Regards,
Endre.
