tomcat-users mailing list archives

From Martin Gainty <mgai...@hotmail.com>
Subject RE: UTF-8 encoding in Tomcat 6.0
Date Tue, 03 Aug 2010 13:47:53 GMT

This works in Tomcat 6.0.20:
http://localhost:8007/sampleweb/params?test=%D8

Log output:

Servlet init
IN doFilter....UTF-8
Ø
 
Unicode code point: U+00D8, character: Ø, UTF-8 (hex): c3 98, name: LATIN CAPITAL LETTER O WITH STROKE
(from http://www.utf8-chartable.de/unicode-utf8-table.pl)
 
?
Martin 
______________________________________________ 
do not alter or disrupt this transmission.
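
One possible reading of the log above, sketched with java.net.URLDecoder rather than Tomcat's own decoder (an assumption; Tomcat's internal decoding may differ in detail). The single byte %D8 is Ø in ISO-8859-1, whereas the UTF-8 bytes listed for Ø are c3 98, so a UTF-8 decoder would expect %C3%98 and treats a lone %D8 as malformed input. The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class PercentDecodeDemo {

    // Percent-decode a value using the named charset.
    static String decode(String value, String charset) {
        try {
            return URLDecoder.decode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Render each char as U+XXXX so the result does not depend on console encoding.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // One byte 0xD8 read as ISO-8859-1: U+00D8
        System.out.println(codePoints(decode("%D8", "ISO-8859-1")));
        // Two bytes 0xC3 0x98 read as UTF-8: U+00D8
        System.out.println(codePoints(decode("%C3%98", "UTF-8")));
        // One byte 0xD8 read as UTF-8 is malformed: U+FFFD (replacement character)
        System.out.println(codePoints(decode("%D8", "UTF-8")));
    }
}
```

Printing code points rather than the characters themselves keeps the result independent of the console encoding, which can itself turn an unprintable character into ?.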





Date: Tue, 3 Aug 2010 04:18:36 -0700
From: arunbhat01@yahoo.com
Subject: Re: UTF-8 encoding in Tomcat 6.0
To: users@tomcat.apache.org

Hello Mark,
I have Tomcat version apache-tomcat-6.0.29, which I downloaded from:
http://tomcat.apache.org/
As I understand it, this version does not come bundled with any other components (reverse
proxies etc.). Am I correct?
I wrote a sample application - the source is in SAMPLEWEB-src.zip:
 
1. Unzip sampleweb.zip into the webapps folder of Tomcat.
 
The application is basically just a sample servlet that prints out what it receives. A filter
is also attached.
 
2. Invoke it with a GET:
http://localhost:8080/sampleweb/params?test=%D8
The param shows up as ? in the Tomcat console.
Now send it with the % escaped as %25 and we see %D8.
 
3. Invoke it with a POST (TestInvoke.html), entering %D8 in the text field, and we see it
as %D8 only.
 
The result is the same if I use the java.net APIs (URLEncoder and URLDecoder used to encode
and decode the characters).
The behavior is the same with and without filters. Without the filter, I can see the %xx
sequence in the Tomcat console for POST but not for GET.
 
Am I sending the parameter incorrectly?
Thanks and Regards
Arun
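
For comparison with steps 2 and 3, a minimal sketch of what one versus two rounds of percent-decoding produce, using java.net.URLDecoder with ISO-8859-1 as a stand-in decoder (an assumption; it is not Tomcat's own code, and the class and method names are hypothetical). A single pass turns %25D8 into the literal text %D8; only a second pass turns it into a character, which is the double decoding discussed in the quoted reply below. The last call shows why a second pass is risky: a doubly encoded "../" survives one decode unrecognised.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class DoubleDecodeDemo {

    // One round of percent-decoding, as a correctly behaving server would do.
    static String decodeOnce(String value) {
        try {
            return URLDecoder.decode(value, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // ISO-8859-1 is a charset every Java platform must support.
            throw new IllegalStateException(e);
        }
    }

    // Two rounds, as happens when a second component decodes again.
    static String decodeTwice(String value) {
        return decodeOnce(decodeOnce(value));
    }

    public static void main(String[] args) {
        // Single decode: %25 becomes a literal '%', leaving the text "%D8".
        System.out.println(decodeOnce("%25D8"));
        // Double decode: the leftover "%D8" is decoded again into U+00D8.
        System.out.println(String.format("U+%04X", (int) decodeTwice("%25D8").charAt(0)));
        // A path check that runs after the first decode sees "%2e%2e%2f", not "../".
        System.out.println(decodeOnce("%252e%252e%252f"));
        System.out.println(decodeTwice("%252e%252e%252f"));
    }
}
```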
 
 
 
--- On Sun, 8/1/10, Mark Thomas <markt@apache.org> wrote:
 
> From: Mark Thomas <markt@apache.org>
> Subject: Re: UTF-8 encoding in Tomcat 6.0
> To: "Tomcat Users List" <users@tomcat.apache.org>
> Date: Sunday, August 1, 2010, 5:05 AM
> On 31/07/2010 17:34, arun kumar wrote:
> > 
> > I ran my example webapp on a standalone tomcat and the behavior was the same:
> > When the param is being sent using GET, I need to send the param as %25xx
> > for it to be read correctly
> > When the method is PUT, %xx works fine.
> 
> Then something in your setup is badly broken, evidenced by the fact you
> have to encode the % as %25 to get things to work.
> 
> > I believe this is a known issue with Tomcat: I remember reading this on
> > many forums. I believe this is the same behavior that Erik reports.
> 
> This is absolutely *not* a Tomcat problem. Tomcat does not behave the
> way you describe. A clean Tomcat install with no other components
> (reverse proxy etc) using the test encoding JSP from the wiki [1] works
> correctly with POST and GET (if URIEncoding="UTF-8" is used).
> 
> > Sorry Mark - I did not get what you said. Could you please elaborate?
> 
> Decoding is happening twice. i.e.:
> %25xx -> %xx
> %xx -> whatever character
> 
> Tomcat absolutely, 100% does not do this. Either your test application
> is doing it or there is another component - such as a reverse proxy - in
> the mix that is doing a second decoding.
> 
> This represents a significant security risk. Issues caused by double
> decoding in the past include:
> - XSS
> - source code disclosure
> - authentication bypass
> - directory traversal
> 
> It does not mean that these issues will be present, but double decoding
> has been the cause of all of these - and probably more - at various
> points in the past. The details will depend on system configuration but
> seeing an issue like this is certainly indicative that there may well be
> a problem.
> 
> Mark
> 
> [1] http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q4
> 
> 
> > Regards
> > Arun
> > 
> > --- On Sat, 7/31/10, Mark Thomas <markt@apache.org> wrote:
> > 
> >> From: Mark Thomas <markt@apache.org>
> >> Subject: Re: UTF-8 encoding in Tomcat 6.0
> >> To: "Tomcat Users List" <users@tomcat.apache.org>
> >> Date: Saturday, July 31, 2010, 12:18 PM
> >> On 31/07/2010 15:40, arun kumar wrote:
> >>> Hi Erik
> >>>    Thanks very much for your responses.
> >>> I can assure that i'm interested in this topic even now :).
> >>>
> >>> My scenario is this:
> >>>
> >>> 1. I use a web application that runs in JBOSS.
> >>>
> >>> 2. JBOSS uses a tomcat web container from what i can see.
> >>>
> >>> 3. To my application if i pass a UTF-8 encoded value in hex e.g:
> >>>
> >>> http://<server>:<port>/<servlet>/param=%xx...
> >>>
> >>> Then %xx is not decoded properly. I initially used to send the request
> >>> with a mozilla browser but later started sending it with a java program
> >>> as well with the same results.
> >>>
> >>> I tried setting the URI Encoding parameters in the tomcat server.xml -
> >>> with no success.
> >>> I then set a filter to specifically set the encoding to utf-8 - again
> >>> with no luck - behavior was exactly the same.
> >>>
> >>> But when i sent the param as %25xx ( %25= hex value of the % character),
> >>> it worked fine but i suspect that the string gets stored in ISO 8859
> >>> format - like you say: it gets mangled...
> >>
> >> That smells of double-decoding which as well as breaking your app is
> >> also a security risk. I have seen this when a reverse proxy is in the mix.
> >>
> >> Tomcat will *not* do this on its own.
> >>
> >> Mark
> >>
> >>
> >>
> >>> I wrote a standalone web application that showed the same behavior.
> >>> I haven't tried with a standalone tomcat.
> >>>
> >>> I know that we need to take care of the encodings at various points but
> >>> how can i rule out a problem with my web container configuration
> >>> settings? Or can it be a problem coming from the web container itself?
> >>>
> >>> Thanks and regards
> >>> Arun
> >>>
> >>>
> >>> --- On Fri, 7/30/10, Erik Bunn <ebu@memecry.net> wrote:
> >>>
> >>>> From: Erik Bunn <ebu@memecry.net>
> >>>> Subject: Re: UTF-8 encoding in Tomcat 6.0
> >>>> To: "Tomcat Users List" <users@tomcat.apache.org>
> >>>> Date: Friday, July 30, 2010, 1:55 PM
> >>>> On 7/30/10 6:33 PM, Christopher Schultz wrote:
> >>>>
> >>>>> If all you want to do is set the character encoding, you can easily call
> >>>>> setCharacterEncoding and be done with it: subclassing and overriding
> >>>>> should not be necessary at all, otherwise nobody would have written one
> >>>>> of these:
> >>>>
> >>>> No, I have other reasons to mess there. Nevertheless, adding a filter is
> >>>> probably less iffy, thanks for pointing that out. TC7 provides a suitable
> >>>> example:
> >>>>
> >>>> .../webapps/examples/WEB-INF/classes/filters/SetCharacterEncodingFilter.java
> >>>>
> >>>>> Tomcat versions before 7.x had an option in the <Connector> which could
> >>>>> be used to set the request URI encoding to that of the Content-Type of
> >>>>> the request (useBodyEncodingForURI) and another option for explicitly
> >>>>> and unconditionally setting the encoding to be used for URI decoding
> >>>>> (URIEncoding). I haven't read-up on Tomcat 7 behavior.
> >>>>
> >>>> 7.x Connector has the exact same options. I'll restate, though, that setting
> >>>> the Connector URIEncoding in TC7.x won't currently help when decoding GET
> >>>> parameters in a no-content-type case - without the filter, they will be
> >>>> mangled as ISO-8859-1. If this is different from previous behaviour, maybe I
> >>>> should report a bug.
> >>>>
> >>>> Thanks,
> >>>> //e
> >>>>
> >>>>
> >>>>
> >>
> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> >>>> For additional commands, e-mail: users-help@tomcat.apache.org
 
 
      
