tomcat-users mailing list archives

From Christopher Schultz <>
Subject Re: Tomcat5.0.28 character encoding problem
Date Wed, 25 Jul 2007 16:09:06 GMT


Joe Russo wrote:
> I am in the process of converting from using JRUN to Tomcat

Good for you! Welcome to the community.

> I have
> ran into the problem where these funky symbols are displaying.  I can
> not find any stack traces that would explain or possibly clue into a
> solution.  

Right. These things (encoding problems) hardly ever generate errors;
they just exhibit unexpected behavior.

> My questions are:  
> Does Tomcat have problems with any types of encoding?      

Yes and no. Tomcat behaves exactly as the HTTP specification mandates.
That is, it interprets all incoming data using the ISO-8859-1 character
encoding unless the request states otherwise (in the Content-Type
header). Some browsers don't send the encoding along with the
Content-Type, so Tomcat falls back to ISO-8859-1 even when the data was
actually sent in some other encoding.
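
To make the fallback rule concrete, here is a minimal sketch of the decision described above: use the charset named in the Content-Type header if there is one, otherwise assume ISO-8859-1. The class and method names (RequestCharset, fromContentType) are made up for illustration; this is not Tomcat's actual code, only the same rule written out by hand.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

final class RequestCharset {
    // Hypothetical helper: returns the charset named in a Content-Type
    // header value, or ISO-8859-1 when none is given.
    static Charset fromContentType(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim().replace("\"", "");
                    // Throws an unchecked exception if the name is not a known charset.
                    return Charset.forName(name);
                }
            }
        }
        // No header, or no charset parameter: fall back to the default.
        return StandardCharsets.ISO_8859_1;
    }

    public static void main(String[] args) {
        System.out.println(fromContentType("application/x-www-form-urlencoded; charset=UTF-8"));
        System.out.println(fromContentType("application/x-www-form-urlencoded"));
        System.out.println(fromContentType(null));
    }
}
```

Only the first call names a charset, so the other two end up with the ISO-8859-1 default.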

Some browsers only send an encoding when there is POST data, since the
Content-Type only really makes sense when there is request content (the
POST data). Unfortunately, the browser usually encodes the URL of a
request using the same encoding it would have declared in the
Content-Type. So, if a browser uses UTF-8 to encode the URL (which is
typical these days), but doesn't send a Content-Type header (or leaves
out the encoding), then Tomcat interprets it incorrectly as ISO-8859-1,
and you get funny characters.
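
The mismatch can be reproduced outside Tomcat with nothing but the JDK. The sketch below decodes the same percent-encoded value twice with java.net.URLDecoder, once as UTF-8 and once as ISO-8859-1. The sample value "caf%C3%A9" and the class name UrlDecodeDemo are assumptions chosen for illustration, not something taken from the original report.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class UrlDecodeDemo {
    // Decodes a percent-encoded value using the named charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "%C3%A9" is the UTF-8 percent-encoding of e-acute (U+00E9).
        String fromBrowser = "caf%C3%A9";

        // Decoded as intended: 4 characters, the last one is U+00E9.
        String asUtf8 = decode(fromBrowser, "UTF-8");

        // Decoded with the wrong charset: each byte becomes its own
        // character, so the value ends in U+00C3 U+00A9 instead.
        String asLatin1 = decode(fromBrowser, "ISO-8859-1");

        System.out.println(asUtf8.length() + " chars when decoded as UTF-8");
        System.out.println(asLatin1.length() + " chars when decoded as ISO-8859-1");
    }
}
```

The extra character in the second result is the "funny character": one two-byte UTF-8 sequence has been read as two separate ISO-8859-1 characters.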

It's not Tomcat's fault. It's actually not the browser's fault, either.
It's actually the HTTP spec's fault, since the character encoding used
in URLs isn't explicitly laid out. :(

> What type of characters are being displayed below and any advice in
> troubleshooting or solving this would be gratefully appreciated.

The presence of the 'â' character looks to me like a UTF-8 URL being
interpreted as an ISO-8859-1 URL. Try searching Google for
CharacterEncodingFilter and take a look at that. It tries to recover
from requests that don't include a character encoding. You should also
look at the "URIEncoding" attribute of the <Connector> element. You can
set the encoding to something other than the default (ISO-8859-1).
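
If you want to check that diagnosis by hand, here is a small sketch, again using only the JDK. It shows how a stray 'â' (U+00E2) can appear when UTF-8 bytes are read as ISO-8859-1, and how the damage can be reversed as long as the garbled string has not been altered further. The sample text (a word containing a curly apostrophe, U+2019) and the names MojibakeDemo, misread and repair are assumptions for illustration. This is not what CharacterEncodingFilter or URIEncoding do; those aim to get the decoding right in the first place.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    // Simulates the bug: UTF-8 bytes read back as ISO-8859-1.
    static String misread(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses it: recover the original bytes, then decode them as UTF-8.
    // Only works if every character in the garbled string is <= U+00FF.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // U+2019 (curly apostrophe) is the bytes E2 80 99 in UTF-8.
        String original = "it\u2019s";
        String garbled = misread(original);

        // The first of the three bytes, E2, shows up as U+00E2.
        System.out.println((int) garbled.charAt(2)); // prints 226 (0xE2)
        System.out.println(repair(garbled).equals(original)); // prints true
    }
}
```

A repair like this is a diagnostic aid more than a fix; setting the right encoding up front avoids the need for it.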

For more information, see: (if you use JK) (if you don't)

-chris

