tomcat-users mailing list archives

From Mark Thomas <ma...@apache.org>
Subject Re: UTF-8 encoding in Tomcat 6.0
Date Sat, 31 Jul 2010 16:18:10 GMT
On 31/07/2010 15:40, arun kumar wrote:
> Hi Erik,
>   Thanks very much for your responses.
> I can assure you that I'm still interested in this topic :).
> 
> My scenario is this:
> 
> 1. I use a web application that runs in JBoss.
> 
> 2. JBoss uses a Tomcat web container, from what I can see.
> 
> 3. If I pass my application a UTF-8 encoded value in hex, e.g.:
> http://<server>:<port>/<servlet>/param=%xx...
> 
> then %xx is not decoded properly. I initially sent the request with a Mozilla
> browser, but later started sending it with a Java program as well, with the same results.
> 
> I tried setting the URIEncoding parameter in the Tomcat server.xml, with no success.
> I then added a filter to specifically set the encoding to UTF-8, again with no luck:
> the behavior was exactly the same.
> 
> But when I sent the param as %25xx (%25 = hex value of the % character), it worked fine,
> but I suspect that the string gets stored in ISO 8859 format. Like you say: it gets mangled...

That smells of double-decoding, which, as well as breaking your app, is
also a security risk. I have seen this when a reverse proxy is in the mix.

Tomcat will *not* do this on its own.
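
A rough way to see the symptom outside any container is sketched below. It is
an illustration only, not Tomcat's decoding code: the class name
DoubleDecodeDemo, the helper decodeOnce and the sample value %C3%A9 are all
made up for the example, and java.net.URLDecoder (which does form-style
decoding, so it also turns '+' into a space) merely stands in for whatever
component is decoding the request. If the %25xx form only comes out right
after a second decode pass, something between the client and the servlet is
decoding twice.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class DoubleDecodeDemo {

    // Hypothetical helper: percent-decode a value exactly once with the given charset.
    static String decodeOnce(String value, String charset) throws UnsupportedEncodingException {
        return URLDecoder.decode(value, charset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Illustrative value: the UTF-8 bytes of U+00E9, percent-encoded.
        String encoded = "%C3%A9";
        // The same value with each '%' itself sent as %25.
        String doubleEncoded = "%25C3%25A9";

        // One pass with the right charset yields the intended single character.
        System.out.println(decodeOnce(encoded, "UTF-8"));

        // One pass with ISO-8859-1 yields two characters: the "mangled" form.
        System.out.println(decodeOnce(encoded, "ISO-8859-1"));

        // One pass over the %25 form only restores the percent signs...
        String once = decodeOnce(doubleEncoded, "UTF-8");
        System.out.println(once);

        // ...so the character only appears if a second pass happens somewhere.
        System.out.println(decodeOnce(once, "UTF-8"));
    }
}
```

Comparing what the servlet actually receives against these four cases should
help tell a charset problem (second case) apart from a double-decoding problem
(last two cases).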

Mark



> I wrote a standalone web application that showed the same behavior.
> I haven't tried with a standalone Tomcat.
> 
> I know that we need to take care of the encodings at various points, but how can I rule
> out a problem with my web container configuration settings? Or can it be a problem coming
> from the web container itself?
> 
> Thanks and regards
> Arun
> 
> 
> --- On Fri, 7/30/10, Erik Bunn <ebu@memecry.net> wrote:
> 
>> From: Erik Bunn <ebu@memecry.net>
>> Subject: Re: UTF-8 encoding in Tomcat 6.0
>> To: "Tomcat Users List" <users@tomcat.apache.org>
>> Date: Friday, July 30, 2010, 1:55 PM
>> On 7/30/10 6:33 PM, Christopher Schultz wrote:
>>
>>> If all you want to do is set the character encoding, you can easily call
>>> setCharacterEncoding and be done with it: subclassing and overriding
>>> should not be necessary at all, otherwise nobody would have written one
>>> of these:
>>
>> No, I have other reasons to mess there. Nevertheless, adding a filter is
>> probably less iffy, thanks for pointing that out. TC7 provides a suitable
>> example:
>> .../webapps/examples/WEB-INF/classes/filters/SetCharacterEncodingFilter.java
>>
>>> Tomcat versions before 7.x had an option in the <Connector> which could
>>> be used to set the request URI encoding to that of the Content-Type of
>>> the request (useBodyEncodingForURI) and another option for explicitly
>>> and unconditionally setting the encoding to be used for URI decoding
>>> (URIEncoding). I haven't read-up on Tomcat 7 behavior.
>>
>> 7.x Connector has the exact same options. I'll restate, though, that setting
>> the Connector URIEncoding in TC7.x won't currently help when decoding GET
>> parameters in a no-content-type case - without the filter, they will be
>> mangled as ISO-8859-1. If this is different from previous behaviour, maybe I
>> should report a bug.
>>
>> Thanks,
>> //e
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>> For additional commands, e-mail: users-help@tomcat.apache.org