works in Tomcat 6.0.20
http://localhost:8007/sampleweb/params?test=%D8
log output
Servlet init
IN doFilter....UTF-8
Ø

U+00D8 Ø c3 98 LATIN CAPITAL LETTER O WITH STROKE
http://www.utf8-chartable.de/unicode-utf8-table.pl

?
Martin
______________________________________________
do not alter or disrupt this transmission.
Date: Tue, 3 Aug 2010 04:18:36 -0700
From: arunbhat01@yahoo.com
Subject: Re: UTF-8 encoding in Tomcat 6.0
To: users@tomcat.apache.org
Hello Mark
I have Tomcat version apache-tomcat-6.0.29, which I downloaded from:
http://tomcat.apache.org/
As per my understanding, this version does not come bundled with any other
components, reverse proxies etc. Am I correct?
I wrote a sample application - the source is in SAMPLEWEB-src.zip:

1. Unzip sampleweb.zip into the webapps folder of Tomcat.

The application is basically just a sample servlet that prints out what it
gets. There is also a filter attached.

2. Invoke it with a GET:
http://localhost:8080/sampleweb/params?test=%D8
The param shows up as ? in the Tomcat console.
Now send it with the % escaped as %25 and we see %D8.

3. Invoke it with a POST (TestInvoke.html), entering %D8 in the text field,
and we see it as %D8 only.
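
To make the difference between decoding once and decoding twice concrete, here is a minimal sketch that uses only java.net.URLDecoder. The class name DecodeOnceOrTwice, the helper decodeOnce and the sample value %25D8 are made up for illustration, and ISO-8859-1 is assumed because it is the default URIEncoding in Tomcat 6. It is not how Tomcat itself is implemented.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of the sample webapp.
final class DecodeOnceOrTwice {

    // Percent-decode exactly one time, reading the decoded bytes as ISO-8859-1.
    static String decodeOnce(String s) {
        try {
            return URLDecoder.decode(s, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // ISO-8859-1 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String sent = "%25D8";            // the % itself escaped as %25 (assumed test value)
        String once = decodeOnce(sent);   // one decode: the literal three characters %D8
        String twice = decodeOnce(once);  // a second decode: one character, U+00D8

        System.out.println(once);                                          // prints %D8
        System.out.println(String.format("U+%04X", (int) twice.charAt(0))); // prints U+00D8
    }
}
```

A single decode of %25D8 gives the literal text %D8, and only a second decode turns that into one character. Seeing %D8 in the console after sending %25D8 is therefore, on its own, consistent with the value having been decoded once.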

The result is the same if I use the java.net APIs (URLEncoder used to encode
and decode the characters).
The behavior is the same with and without filters. Without the filter, I can
see the %xx character in the Tomcat console for POST but not for GET.
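
What the single escape %D8 means under each charset can be checked with the same java.net classes. This is only a sketch, under the assumption that the container decodes a query string roughly the way URLDecoder does. PercentD8Demo and its helpers are illustrative names, not part of the sample webapp.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class comparing ISO-8859-1 and UTF-8 handling of %D8.
final class PercentD8Demo {

    // Percent-decode once with the named charset.
    static String decode(String s, String charset) {
        try {
            return URLDecoder.decode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Percent-encode with the named charset.
    static String encode(String s, String charset) {
        try {
            return URLEncoder.encode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Render each char as U+XXXX so the output does not depend on the console encoding.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // As ISO-8859-1, the byte 0xD8 is the character O WITH STROKE.
        System.out.println(codePoints(decode("%D8", "ISO-8859-1"))); // prints U+00D8

        // As UTF-8, a lone 0xD8 byte is an incomplete sequence, so the result
        // is not U+00D8 (in practice Java substitutes a replacement character).
        System.out.println(codePoints(decode("%D8", "UTF-8")));

        // The UTF-8 form of the same character is two bytes, c3 98.
        String oWithStroke = decode("%D8", "ISO-8859-1");
        System.out.println(encode(oWithStroke, "UTF-8"));            // prints %C3%98
        System.out.println(codePoints(decode("%C3%98", "UTF-8")));   // prints U+00D8
    }
}
```

If the connector is set to URIEncoding="UTF-8", a request carrying this character would be expected to use test=%C3%98 rather than test=%D8, which may explain a ? in the console for the GET case.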

Am I sending some parameter wrongly?
Thanks and regards
Arun

--- On Sun, 8/1/10, Mark Thomas <markt@apache.org> wrote:

> From: Mark Thomas <markt@apache.org>
> Subject: Re: UTF-8 encoding in Tomcat 6.0
> To: "Tomcat Users List" <users@tomcat.apache.org>
> Date: Sunday, August 1, 2010, 5:05 AM
> On 31/07/2010 17:34, arun kumar wrote:
> >
> > I ran my example webapp on a standalone tomcat and the behavior was
> > the same:
> > When the param is being sent using GET, I need to send the param as
> > %25xx for it to be read correctly
> > When the method is PUT, %xx works fine.
>
> Then something in your setup is badly broken, evidenced by the fact you
> have to encode the % as %25 to get things to work.
>
> > I believe this is a known issue with Tomcat: I remember reading this
> > on many forums. I believe this is the same behavior that Erik reports.
>
> This is absolutely *not* a Tomcat problem. Tomcat does not behave the
> way you describe. A clean Tomcat install with no other components
> (reverse proxy etc) using the test encoding JSP from the wiki [1] works
> correctly with POST and GET (if URIEncoding="UTF-8" is used).
>
> > Sorry Mark - i did not get what you said. Could you please elaborate?
>
> Decoding is happening twice. i.e.:
> %25xx -> %xx
> %xx -> whatever character
>
> Tomcat absolutely, 100% does not do this. Either your test application
> is doing it or there is another component - such as a reverse proxy - in
> the mix that is doing a second decoding.
>
> This represents a significant security risk. Issues caused by double
> decoding in the past include:
> - XSS
> - source code disclosure
> - authentication bypass
> - directory traversal
>
> It does not mean that these issues will be present, but double decoding
> has been the cause of all of these - and probably more - at various
> points in the past. The details will depend on system configuration but
> seeing an issue like this is certainly indicative that there may well be
> a problem.
>
> Mark
>
> [1] http://wiki.apache.org/tomcat/FAQ/CharacterEncoding#Q4
>
> > Regards
> > Arun
> >
> > --- On Sat, 7/31/10, Mark Thomas <markt@apache.org> wrote:
> >
> >> From: Mark Thomas <markt@apache.org>
> >> Subject: Re: UTF-8 encoding in Tomcat 6.0
> >> To: "Tomcat Users List" <users@tomcat.apache.org>
> >> Date: Saturday, July 31, 2010, 12:18 PM
> >> On 31/07/2010 15:40, arun kumar wrote:
> >>> Hi Erik
> >>> Thanks very much for your responses.
> >>> I can assure that i'm interested in this topic even now :).
> >>>
> >>> My scenario is this:
> >>>
> >>> 1. I use a web application that runs in JBOSS.
> >>>
> >>> 2. JBOSS uses a tomcat web container from what i can see.
> >>>
> >>> 3. To my application if i pass a UTF-8 encoded value in hex e.g:
> >>>
> >>> http://<server>:<port>/<servlet>/param=%xx...
> >>>
> >>> Then %xx is not decoded properly. I initially used to send the
> >>> request with a mozilla browser but later started sending it with a
> >>> java program as well with the same results.
> >>>
> >>> I tried setting the URI Encoding parameters in the tomcat
> >>> server.xml - with no success.
> >>> I then set a filter to specifically set the encoding to utf-8 -
> >>> again with no luck - behavior was exactly the same.
> >>>
> >>> But when i sent the param as %25xx ( %25= hex value of the %
> >>> character), it worked fine but i suspect that the string gets stored
> >>> in ISO 8859 format - like you say: it gets mangled...
> >>
> >> That smells of double-decoding which as well as breaking your app is
> >> also a security risk. I have seen this when a reverse proxy is in the
> >> mix.
> >>
> >> Tomcat will *not* do this on its own.
> >>
> >> Mark
> >>
> >>> I wrote a standalone web application that showed the same behavior.
> >>> I haven't tried with a standalone tomcat.
> >>>
> >>> I know that we need to take care of the encodings at various points
> >>> but how can i rule out a problem with my web container configuration
> >>> settings? Or can it be a problem coming from the web container
> >>> itself?
> >>>
> >>> Thanks and regards
> >>> Arun
> >>>
> >>>
> >>> --- On Fri, 7/30/10, Erik Bunn <ebu@memecry.net> wrote:
> >>>
> >>>> From: Erik Bunn <ebu@memecry.net>
> >>>> Subject: Re: UTF-8 encoding in Tomcat 6.0
> >>>> To: "Tomcat Users List" <users@tomcat.apache.org>
> >>>> Date: Friday, July 30, 2010, 1:55 PM
> >>>> On 7/30/10 6:33 PM, Christopher Schultz wrote:
> >>>>
> >>>>> If all you want to do is set the character encoding, you can
> >>>>> easily call setCharacterEncoding and be done with it: subclassing
> >>>>> and overriding should not be necessary at all, otherwise nobody
> >>>>> would have written one of these:
> >>>>
> >>>> No, I have other reasons to mess there. Nevertheless, adding a
> >>>> filter is probably less iffy, thanks for pointing that out. TC7
> >>>> provides a suitable example:
> >>>>
> >>>> .../webapps/examples/WEB-INF/classes/filters/SetCharacterEncodingFilter.java
> >>>>
> >>>>> Tomcat versions before 7.x had an option in the <Connector> which
> >>>>> could be used to set the request URI encoding to that of the
> >>>>> Content-Type of the request (useBodyEncodingForURI) and another
> >>>>> option for explicitly and unconditionally setting the encoding to
> >>>>> be used for URI decoding (URIEncoding). I haven't read-up on
> >>>>> Tomcat 7 behavior.
> >>>>
> >>>> 7.x Connector has the exact same options. I'll restate, though,
> >>>> that setting the Connector URIEncoding in TC7.x won't currently
> >>>> help when decoding GET parameters in a no-content-type case -
> >>>> without the filter, they will be mangled as ISO-8859-1. If this is
> >>>> different from previous behaviour, maybe I should report a bug.
> >>>>
> >>>> Thanks,
> >>>> //e
> >>>>
> >>>> ---------------------------------------------------------------------
> >>>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
> >>>> For additional commands, e-mail: users-help@tomcat.apache.org