tomcat-dev mailing list archives

From Hans Bergsten
Subject Re: Justification for URIEncoding addition?
Date Fri, 21 Nov 2003 18:54:41 GMT
Remy Maucherat wrote:
> Larry Isaacs wrote:
>> Hi Remy,
>> Okay, re-reviewed the original 22666 thread.  To complete this thread,
>> I'll assume the following from RFC2718 is our justification for the
>> new behavior:
>>       Unless there is some compelling reason for a
>>       particular scheme to do otherwise, translating character
>>       sequences into UTF-8 (RFC 2279) [3] and then subsequently
>>       using the %HH encoding for unsafe octets is recommended.
>> Tomcat will default to US-ASCII instead of UTF-8 so it won't break
>> too many existing webapps.  If there are other parts to this story,
>> I would be interested in learning of them.
>> I'm still concerned that this makes Tomcat less useful by creating
>> deployment problems for webapps that aren't technically broken.
>> However, these issues were covered in the prior e-mail thread
>> (, 
>> so I'll drop the issue.  Thanks.
> The idea for the change is that there's no compelling reason (except 
> hacking) to have one part of the URI be in some encoding (US-ASCII or 
> UTF-8, if you want to have any chance of mapping it successfully), and 
> the rest encoded in something else.
> There's indeed a bug thread on this issue, and I was on your side at first.

I've browsed through the thread referenced above as well as the comments
on bug 22666. Sorry if I'm missing something here, but to me it seems
like what Craig did for TC 4.x is the solution that's less harmful,
namely let Jasper get the "jsp_precompile" parameter by scanning the
getQueryString() result instead of using getParameter().
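To make the idea concrete, here is a rough sketch of such a scan. It is not the actual TC 4 code; the class and method names are made up for illustration, and treating only an empty value or "true" as a precompile request is a simplifying assumption. The point is that the check works on the raw query string, so nothing forces parameter parsing before the application has had a chance to call setCharacterEncoding().

```java
// Hypothetical helper, not taken from the Tomcat sources.
public final class PrecompileCheck {

    private static final String PRECOMPILE = "jsp_precompile";

    private PrecompileCheck() {
    }

    /**
     * Inspects the raw query string (the value of getQueryString()) for the
     * jsp_precompile parameter without decoding or parsing any parameters.
     * Assumption: an empty value or "true" means "precompile"; anything
     * else is treated as a normal request.
     */
    public static boolean isPrecompileRequest(String queryString) {
        if (queryString == null) {
            return false;
        }
        for (String pair : queryString.split("&")) {
            int eq = pair.indexOf('=');
            String name = (eq < 0) ? pair : pair.substring(0, eq);
            if (!name.equals(PRECOMPILE)) {
                continue; // exact name match only, so "xjsp_precompile" is ignored
            }
            String value = (eq < 0) ? "" : pair.substring(eq + 1);
            return value.isEmpty() || value.equals("true");
        }
        return false;
    }
}
```

In a servlet this would be called as PrecompileCheck.isPrecompileRequest(request.getQueryString()) in place of the getParameter() lookup.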

It's clear that enforcing the RFC2718 recommendation breaks a lot of
apps (based on all the bug reports and questions about this), and AFAIK,
most commonly used browsers (or all of them) use the encoding of the
page to encode parameters in both the body and the query string. It
therefore seems reasonable to use the setCharacterEncoding() value to
decode both types of parameters (at least as a default) and fix 22666
by avoiding the premature call to getParameter() that Jasper does in
the same way as it's done in TC 4.
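As an illustration of why the decoding charset matters, here is a small sketch using only java.net.URLDecoder. The class and method names are hypothetical, and the fallback to ISO-8859-1 when no encoding has been set mirrors the servlet default; this is not how Tomcat implements parameter decoding internally.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper, for illustration only.
public final class QueryParamDecoder {

    private static final String DEFAULT_ENCODING = "ISO-8859-1";

    private QueryParamDecoder() {
    }

    /**
     * Decodes one %HH-escaped parameter value using the encoding the
     * application set via setCharacterEncoding(), or ISO-8859-1 if none
     * was set (requestEncoding == null).
     */
    public static String decode(String rawValue, String requestEncoding) {
        String encoding = (requestEncoding != null) ? requestEncoding : DEFAULT_ENCODING;
        try {
            return URLDecoder.decode(rawValue, encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
    }
}
```

A browser that submits a form from an ISO-8859-1 page sends "café" as caf%E9, while one submitting from a UTF-8 page sends caf%C3%A9. Each decodes correctly only with the page's own encoding, which is the value the application would normally pass to setCharacterEncoding().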

My apologies if I missed a part of the thread that discussed this
solution and found it flawed.

Hans Bergsten
Gefion Software
Author of O'Reilly's "JavaServer Pages", covering JSP 2.0 and JSTL 1.1
Details at                                    <>
