tomcat-dev mailing list archives

From: Remy Maucherat
Subject: Re: Justification for URIEncoding addition?
Date: Tue, 25 Nov 2003 07:33:23 GMT
Hans Bergsten wrote:
> Remy Maucherat wrote:
>> Hans Bergsten wrote:
>>> Larry Isaacs wrote:
>>>> Hans,
>>>> The behavior change is unrelated to the use of getParameter()
>>>> to search for "jsp_precompile".  Both Tomcat 3.x and Tomcat 4.x
>>>> were bitten by this long ago and Craig's fix was applied to both.
>>>> In Tomcat 4's case, it was prior to the 4.0 release.
>>> Okay, I'm sure you're right that there may be more to it than
>>> avoiding the getParameter() call in Jasper, but based on what
>>> I've read, it seems to be part of the problem at least.
>>>> Assuming I have a good grip on the issue, I think it relates
>>>> to using UTF-8 to decode the path portion of the URL which
>>>> gets used to determine context, servlet mapping, etc.  Then
>>>> allowing setCharacterEncoding() to change the character encoding
>>>> for the query portion of the same URL.  The Servlet 2.3 and 2.4
>>>> specs both say setCharacterEncoding() applies to the request body
>>>> but don't mention it applying to the query portion of the URL.
>>> Right, but since the servlet spec doesn't say anything about encoding
>>> for the query portion, I think we have some room for a sensible
>>> interpretation.
>>> My concern is that with the new decoding behavior, apps that used to
>>> work fine suddenly don't, and the reason seems to be that browsers
>>> in fact ignore the RFC2718 recommendation that TC now enforces. I'm
>>> all for compliance with all related specs, but in this case it's just
>>> a recommendation and following it seems to do more harm than good.
>>> I agree it's not as clean as you may want, but are there any real
>>> problems with decoding the path portion using one charset and the
>>> query string with another (i.e., the one from getCharacterEncoding()),
>>> the way it used to be done?
>> I see you are a member of the expert group for the servlet spec. Did
>> you raise those points during the review period? If not, then IMO you
>> have nothing to complain about, especially since Tomcat implements a far
>> more reasonable and simpler behavior for the URL string handling.
> Remy, I'm not complaining, I'm just trying to help with ideas for how
> to solve a problem that apparently affects a lot of people. Sigh!
> Yes, I'm in the servlet spec EG and I did help solve other i18n
> problems by bringing together all the spec leads for servlets, JSP and
> JSTL and Sun's i18n guru to fix inconsistencies between these specs.
> Unfortunately, I missed the query string encoding problem, largely
> because the way TC handled it before the recent change seemed to work
> for most apps so I hadn't encountered the problem. My bad.
>> The specification should have specified something along the lines of:
>> - The URL must be %xx encoded
>> - This decodes to bytes representing UTF-8 characters
>> There's an IETF standard that, I think, states this in B&W. It is 
>> being ignored. Maybe this wouldn't be the case if very popular tech, 
>> such as servlets & JSPs, started mandating it? This is simply a
>> chicken & egg issue.
> And because it's a chicken and egg problem, I doubt that it will ever be
> solved. No server vendor is likely to change the behavior in a way
> that's incompatible with a large set of browsers. A more sensible way
> to solve this would be for W3C to change the spec to require the
> behavior most browsers already implement, even if it's less elegant.
>> i18n issues with HTTP and servlets have been known about for years,
>> but unfortunately they still haven't been addressed properly.
>> Same with the request dispatcher + wrapping issues that I pointed
>> out months ago (and of course, were silently ignored).
>> To balance this a little, among the other big issues, I have to give 
>> credit for solving the welcome files in a satisfactory way, as well as 
>> filters with RDs. Filters now make the proprietary APIs provided by 
>> the container irrelevant for most tasks.
> I'm glad you like something in the new spec ;-) Although, there's more
> to be done with the welcome file mechanism. I tried to get it all done
> in 2.4, but we couldn't reach consensus so what's there now is still too
> vague, IMHO.

I think I saw your proposal (at least I saw a radical refactoring of the 
feature), and I was *really* mad about it. I'm very very glad it got killed.

Sorry, but I completely disagree with *all* the points you make in this 
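
To make the decoding question in the quoted discussion concrete, here is a
minimal Java sketch. It is not Tomcat's code. It decodes the same %xx
sequences once as UTF-8 and once as ISO-8859-1 with the standard
java.net.URLDecoder. The class name QueryDecodingSketch and its decode()
helper are hypothetical, and ISO-8859-1 only stands in for whatever charset
a browser or getCharacterEncoding() might imply. Note that URLDecoder follows
form-encoding rules, so it also turns "+" into a space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical illustration class; not part of Tomcat or the servlet API.
final class QueryDecodingSketch {

    // Decode a percent-encoded string, interpreting the resulting bytes
    // in the given charset (for example "UTF-8" or "ISO-8859-1").
    static String decode(String percentEncoded, String charsetName) {
        try {
            return URLDecoder.decode(percentEncoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String utf8Form = "%C3%A9";  // U+00E9 (e with acute) encoded as UTF-8 bytes
        String latin1Form = "%E9";   // U+00E9 encoded as a single ISO-8859-1 byte

        // Matching charset: both yield the single character U+00E9.
        String a = decode(utf8Form, "UTF-8");
        String b = decode(latin1Form, "ISO-8859-1");

        // Mismatched charset: the UTF-8 bytes read as ISO-8859-1 become
        // two characters (U+00C3 U+00A9), and the lone 0xE9 byte is not
        // valid UTF-8, so it becomes the replacement character U+FFFD.
        String c = decode(utf8Form, "ISO-8859-1");
        String d = decode(latin1Form, "UTF-8");

        System.out.println(a.equals(b));         // same character either way
        System.out.println(c.length());          // number of chars after mis-decoding
        System.out.println((int) d.charAt(0));   // code point of the mis-decoded char
    }
}
```

If a browser sends %E9 and the container insists on UTF-8, the parameter
value is silently corrupted, which is the kind of breakage described in the
quoted text above.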

