tomcat-dev mailing list archives

From Hans Bergsten <h...@gefionsoftware.com>
Subject Re: Justification for URIEncoding addition?
Date Tue, 25 Nov 2003 01:23:49 GMT
Remy Maucherat wrote:
> Hans Bergsten wrote:
> 
>> Larry Isaacs wrote:
>>
>>> Hans,
>>>
>>> The behavior change is unrelated to the use of getParameter()
>>> to search for "jsp_precompile".  Both Tomcat 3.x and Tomcat 4.x
>>> were bitten by this long ago and Craig's fix was applied to both.
>>> In Tomcat 4's case, it was prior to the 4.0 release.
>>
>>
>>
>> Okay, I'm sure you're right that there may be more to it than
>> avoiding the getParameter() call in Jasper, but based on what
>> I've read, it seems to be part of the problem at least.
>>
>>> Assuming I have a good grip on the issue, I think it relates
>>> to using UTF-8 to decode the path portion of the URL which
>>> gets used to determine context, servlet mapping, etc.  Then
>>> allowing setCharacterEncoding() to change the character encoding
>>> for the query portion of the same URL.  The Servlet 2.3 and 2.4
>>> specs both say setCharacterEncoding() applies to the request body
>>> but don't mention it applying to the query portion of the URL.
>>
>>
>>
>> Right, but since the servlet spec doesn't say anything about encoding
>> for the query portion, I think we have some room for a sensible
>> interpretation.
>>
>> My concern is that with the new decoding behavior, apps that used to
>> work fine suddenly don't, and the reason seems to be that browsers
>> in fact ignore the RFC2718 recommendation that TC now enforces. I'm
>> all for compliance with all related specs, but in this case it's just
>> a recommendation and following it seems to do more harm than good.
>>
>> I agree it's not as clean as you may want, but are there any real
>> problems with decoding the path portion using one charset and the
>> query string with another (i.e., the one from getCharacterEncoding()),
>> the way it used to be done?
> 
> 
> I see you are a member of the expert group for the servlet spec. Did you 
> make those points during the review period? If not, then you IMO 
> have nothing to complain about, esp since Tomcat implements a far more 
> reasonable and simpler behavior for the URL string handling.

Remy, I'm not complaining, I'm just trying to help with ideas for how
to solve a problem that apparently affects a lot of people. Sigh!

Yes, I'm in the servlet spec EG and I did help solve other i18n
problems by bringing together all the spec leads for servlets, JSP and
JSTL and Sun's i18n guru to fix inconsistencies between these specs.
Unfortunately, I missed the query string encoding problem, largely
because the way TC handled it before the recent change seemed to work
for most apps so I hadn't encountered the problem. My bad.

> The specification should have specified something along the lines of:
> - The URL must be %xx encoded
> - This decodes to bytes representing UTF-8 characters
> There's an IETF standard that, I think, states this in B&W. It is being 
> ignored. Maybe this wouldn't be the case if very popular tech, such as 
> servlets & JSPs, started mandating it? This is simply a chicken & egg 
> issue.

And because it's a chicken and egg problem, I doubt that it will ever be
solved. No server vendor is likely to change the behavior in a way
that's incompatible with a large set of browsers. A more sensible way
to solve this would be for W3C to change the spec to require the
behavior most browsers already implement, even if it's less elegant.
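
To make the mismatch concrete, here is a minimal Java sketch. It is not
Tomcat's actual decoding code; the class name QueryDecodingSketch and the
helper decode() are made up for illustration, and the sample value assumes
a page served as ISO-8859-1 whose form sends the word "cafe" with an
accented final e. It only shows that the same %xx bytes give different
strings depending on the charset the container picks.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical illustration only; not taken from Tomcat or Jasper.
class QueryDecodingSketch {

    // Decode a %xx-encoded query value using the given charset name.
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String expected = "caf\u00e9";

        // Assumed browser behavior: the value is encoded in the page's
        // charset (ISO-8859-1 here), so the accented e is the single byte E9.
        String sentByBrowser = "caf%E9";

        // Decoding with the charset the application expects recovers the text.
        String asLatin1 = decode(sentByBrowser, "ISO-8859-1");

        // Decoding the same bytes as UTF-8 cannot recover it, because a lone
        // E9 byte is not a complete UTF-8 sequence.
        String asUtf8 = decode(sentByBrowser, "UTF-8");

        System.out.println("ISO-8859-1 round trip ok: " + expected.equals(asLatin1));
        System.out.println("UTF-8 round trip ok: " + expected.equals(asUtf8));
    }
}
```

An app that called setCharacterEncoding("ISO-8859-1") and relied on the
container honoring it for the query string would be on the first path
before the change and on the second path after it.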

> i18n issues with HTTP and servlets have been known about for years, but 
> unfortunately they still haven't been addressed properly.
> Same with the request dispatcher + wrapping issues that I have pointed 
> out months ago (and of course, were silently ignored).
> 
> To balance this a little, among the other big issues, I have to give 
> credit for solving the welcome files in a satisfactory way, as well as 
> filters with RDs. Filters now make the proprietary APIs provided by the 
> container irrelevant for most tasks.

I'm glad you like something in the new spec ;-) Although, there's more
to be done with the welcome file mechanism. I tried to get it all done
in 2.4, but we couldn't reach consensus, so what's there now is still too
vague, IMHO.

Hans
-- 
Hans Bergsten                                <hans@gefionsoftware.com>
Gefion Software                       <http://www.gefionsoftware.com/>
Author of O'Reilly's "JavaServer Pages", covering JSP 2.0 and JSTL 1.1
Details at                                    <http://TheJSPBook.com/>



