tomcat-dev mailing list archives

From bugzi...@apache.org
Subject DO NOT REPLY [Bug 43848] New: - SetCharacterEncoding and Cached Parameters
Date Mon, 12 Nov 2007 20:18:14 GMT
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://issues.apache.org/bugzilla/show_bug.cgi?id=43848>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND
INSERTED IN THE BUG DATABASE.

http://issues.apache.org/bugzilla/show_bug.cgi?id=43848

           Summary: SetCharacterEncoding and Cached Parameters
           Product: Tomcat 5
           Version: 5.5.24
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: critical
          Priority: P2
         Component: Unknown
        AssignedTo: tomcat-dev@jakarta.apache.org
        ReportedBy: hceylan@batoo.org


Hello,

I have discovered a behavior that I consider a bug, and I would like to request
that it be reviewed.

Let me start with the source of the problem. We have an English/Turkish web
site that uses the UTF-8 charset. I have a Filter that sets the encoding to
UTF-8 for every request coming to Tomcat, so that non-ASCII Turkish characters
are handled correctly. However, I saw that
request.setCharacterEncoding(encoding) did not change anything. It used to
work before, so I suspected a bug in my own code.

I spent 4 hours trying different encodings (UTF-8, ISO-8859-9, ISO-8859-1),
but my forms with Turkish characters stayed corrupted.

To investigate further, I dug into the Tomcat source code.

Then I discovered the following behavior:
The first time any of the request's parameters is accessed, all the parameters
are parsed and cached. There is nothing wrong with that. However, once a
parameter has been accessed at least once, a call to set or change the character
encoding is accepted without any warning, error, or exception, yet it has no
effect, because the encoding is only used at the time the parameters are parsed.
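
A minimal sketch of the behavior described above. It uses a toy class, not
Tomcat's own code: ToyRequest, LateEncodingDemo, the sample query string, and
the ISO-8859-1 default are assumptions made for illustration only.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

// Toy model only: this is NOT Tomcat's request implementation.
class ToyRequest {
    private final String rawQuery;            // percent-encoded, as received
    private String encoding = "ISO-8859-1";   // assumed default when nothing is set
    private Map<String, String> cache;        // stays null until the first parameter access

    ToyRequest(String rawQuery) {
        this.rawQuery = rawQuery;
    }

    void setCharacterEncoding(String enc) {
        // Accepted silently, even if the parameters were already parsed;
        // the cached values are left untouched.
        this.encoding = enc;
    }

    String getParameter(String name) {
        if (cache == null) {
            cache = parse();                  // parse everything once, then reuse
        }
        return cache.get(name);
    }

    private Map<String, String> parse() {
        Map<String, String> params = new HashMap<>();
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue;                     // ignore malformed pairs in this sketch
            }
            try {
                params.put(URLDecoder.decode(pair.substring(0, eq), encoding),
                           URLDecoder.decode(pair.substring(eq + 1), encoding));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e);
            }
        }
        return params;
    }
}

class LateEncodingDemo {
    public static void main(String[] args) {
        String expected = "Ay\u015Fe";        // "Ayse" with s-cedilla (U+015F)

        ToyRequest late = new ToyRequest("name=Ay%C5%9Fe");
        late.getParameter("name");            // e.g. a dump filter touches parameters first
        late.setCharacterEncoding("UTF-8");   // too late: the cache is already filled
        System.out.println(expected.equals(late.getParameter("name")));  // prints false

        ToyRequest early = new ToyRequest("name=Ay%C5%9Fe");
        early.setCharacterEncoding("UTF-8");  // set before any parameter access
        System.out.println(expected.equals(early.getParameter("name"))); // prints true
    }
}
```

In this sketch the only difference between the two requests is whether
getParameter is called before or after setCharacterEncoding.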

In my specific case, I have a filter that dumps all the properties of the
request (URL, headers, cookies, parameters, and session attributes) for ease of
development. It is the very first filter that a request encounters, and because
it accesses the request's parameters, it triggers the caching of the
parameters.

Then the second filter sets the request's encoding to UTF-8, which has NO EFFECT.

Having searched the bug database, I have seen a lot of "encoding ignored" bugs,
and I think a considerable number of them might be related. I think the correct
behavior should be either to expire the cache of parsed parameters when the
encoding changes, or to throw an exception indicating that "it is too late to
change the encoding".
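
The two alternatives suggested above could look roughly like the following
sketch. Again this is a toy model under stated assumptions, not a patch for
Tomcat: EncodingAwareRequest, LateChangePolicy, and the choice of
IllegalStateException are hypothetical names and choices made for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy model, not Tomcat code.
class EncodingAwareRequest {
    // REPARSE: drop the cache when the encoding changes after parsing.
    // REJECT:  refuse the change with an exception.
    enum LateChangePolicy { REPARSE, REJECT }

    private final String rawQuery;
    private final LateChangePolicy policy;
    private String encoding = "ISO-8859-1";   // assumed default
    private Map<String, String> cache;        // null until first access

    EncodingAwareRequest(String rawQuery, LateChangePolicy policy) {
        this.rawQuery = rawQuery;
        this.policy = policy;
    }

    void setCharacterEncoding(String enc) {
        if (cache != null) {
            if (policy == LateChangePolicy.REJECT) {
                throw new IllegalStateException(
                        "It is too late to change the encoding: parameters already parsed");
            }
            cache = null;                     // REPARSE: next access decodes again
        }
        this.encoding = enc;
    }

    String getParameter(String name) {
        if (cache == null) {
            cache = parse();
        }
        return cache.get(name);
    }

    private Map<String, String> parse() {
        Map<String, String> params = new HashMap<>();
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue;                     // ignore malformed pairs in this sketch
            }
            try {
                params.put(URLDecoder.decode(pair.substring(0, eq), encoding),
                           URLDecoder.decode(pair.substring(eq + 1), encoding));
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e);
            }
        }
        return params;
    }
}
```

With REPARSE, a request created from "name=Ay%C5%9Fe" returns the correctly
decoded value even if a parameter was read before setCharacterEncoding("UTF-8")
was called. With REJECT, the same late call fails loudly instead of being
silently ignored.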

Unfortunately, I haven't had the time to check the servlet spec, so my
recommendations may conflict with it. If so, my apologies for the misleading
recommendations.

By the way, I have tested this against 5.5.25, which is not listed in Bugzilla!

Regards,
Hasan Ceylan

-- 
Configure bugmail: http://issues.apache.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

