tomcat-users mailing list archives

From nch <>
Subject Re: Character encoding
Date Wed, 18 Jun 2008 19:09:39 GMT

More info on this:

- I do remote debugging through Eclipse to both tomcat on windows (same machine as eclipse,
though) and tomcat on debian.

- I open a debugging port on tomcat by setting CATALINA_OPTS=-Xmx1024m -Xdebug -Xnoagent -Djava.compiler=NONE

- When I send "piraña" it is always encoded into the URL as "pira%C3%B1a", whether running
tomcat on windows, debian or even running my app in Jetty.

- When I send "piraña", if I'm debugging tomcat on windows I can read "piraña".

- If tomcat is running on debian, I read "piraÃ±a".

- If I type "piraña" on and switch browser encoding
display between ISO-8859-1 and UTF-8, I can see that when ISO-8859-1, then it displays "piraña",
when UTF-8, it displays "piraña".

- When I run/debug my app on Jetty I get "piraña" (I've read on the web that Jetty decodes
to UTF-8 by default).

- Something could be wrong in my debian environment. How can I find out which environment
variables tomcat is using?

- If I try to manually decode the returned parameter into my controller
by using URLDecoder.decode(query, "UTF-8") then I can see no
difference. That is, when debugging the tomcat on windows the result is
"piraña", while debugging the one on debian the result is "piraÃ±a".

- Is URLDecoder#decode environment dependent?
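
On the URLDecoder question, a minimal sketch may help. It assumes the value being decoded came from request.getParameter(), in which case the container has already percent-decoded it and there is nothing left for URLDecoder to do. The class name DecodeCheck and the helper decode are made up for this illustration. When a charset is passed explicitly, URLDecoder.decode does not consult the platform default encoding, so the same input should give the same result on windows and debian.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical helper class, for illustration only.
class DecodeCheck {

    // Wraps URLDecoder.decode so callers do not have to handle the checked exception.
    static String decode(String s, String charset) {
        try {
            return URLDecoder.decode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Renders each char as a hex code unit so the result does not depend on console encoding.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            sb.append(String.format("%04x ", (int) c));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String raw = "pira%C3%B1a";

        // Same percent-encoded bytes, two different charsets.
        System.out.println(hex(decode(raw, "UTF-8")));       // n-tilde is one char, 00f1
        System.out.println(hex(decode(raw, "ISO-8859-1")));  // two chars, 00c3 00b1

        // A string that is already decoded contains no '%' or '+',
        // so a second decode returns it unchanged, garbled or not.
        String alreadyDecoded = "pira\u00c3\u00b1a";
        System.out.println(decode(alreadyDecoded, "UTF-8").equals(alreadyDecoded));
    }
}
```

If this holds, seeing "no difference" after the manual decode is expected: the damage happened earlier, when the container chose a charset for the query string.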

Hope this is useful. Lots of thanks to you all.

----- Original Message ----
From: Christopher Schultz <>
To: Tomcat Users List <>
Sent: Wednesday, June 18, 2008 7:25:03 PM
Subject: Re: Character encoding

nch wrote:
| I have a form that has an input field named "query". I type "piraña"
| and submit the form using the GET method. I can see the browser has
| encoded this parameter into the URI as query=pira%C3%B1a

Is this a correct UTF-8 encoding of the parameter? I don't have my
unicode conversion chart handy right now.
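
One way to check this without a conversion chart is to let Java do the encoding. The sketch below is illustrative only: the class name EncodeCheck and the helper encode are invented for this example. It percent-encodes the same word under UTF-8 and under ISO-8859-1 so the two results can be compared with what the browser sent.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class, for illustration only.
class EncodeCheck {

    // Percent-encodes s using the named charset.
    static String encode(String s, String charset) {
        try {
            return URLEncoder.encode(s, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // \u00f1 is n-tilde; the escape keeps this source file pure ASCII.
        String word = "pira\u00f1a";

        System.out.println(encode(word, "UTF-8"));       // pira%C3%B1a
        System.out.println(encode(word, "ISO-8859-1"));  // pira%F1a
    }
}
```

The UTF-8 result matches the query string quoted above, which suggests the browser side is doing the right thing.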

| I set a breakpoint

Stop right there. If you are executing TC through a debugger, are you
sure that it is using its standard server.xml configuration?

| into the filter so when the request hits the filter I can see
| getCharacterEncoding() returns null. The filter sets it to "UTF-8".

FYI, this has no bearing on the interpretation of the URI.

| Then the request gets to the controller where I can see the request
| parameter "query" is set to "piraÃ±a".

Just in case it doesn't go through email very well, I see "pir" followed
by an A with a tilde over it, followed by a +/- symbol, followed by an
"a". Definitely not right. Is that what you'd expect if you improperly
interpreted the UTF-8, URL-encoded "piraña" as if it were ISO-8859-1?
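
The misreading described above can be reproduced in a few lines. This is only a sketch under the assumption that the container decoded the UTF-8 bytes as ISO-8859-1; the class MojibakeDemo and its two methods are hypothetical names. The repair method is a diagnostic workaround, not a substitute for configuring the decoding charset correctly.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
class MojibakeDemo {

    // Simulates the suspected bug: UTF-8 bytes read back as ISO-8859-1.
    static String misreadUtf8AsLatin1(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the misreading: recover the original bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] original = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(original, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String word = "pira\u00f1a";
        String garbled = misreadUtf8AsLatin1(word);

        // The single n-tilde (U+00F1) becomes A-tilde (U+00C3) plus plus-minus (U+00B1).
        System.out.println(garbled.equals("pira\u00c3\u00b1a")); // true
        System.out.println(repair(garbled).equals(word));        // true
    }
}
```

The garbled form has seven characters instead of six, which is a quick way to spot the problem in a debugger even when the display font hides it.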

-chris