tomcat-users mailing list archives

From André Warnier
Subject Re: Tomcat 6 encoding issue
Date Thu, 12 Nov 2009 08:42:59 GMT
pramodpm wrote:
> We are facing an encoding issue in apache-tomcat-6.0.20. This is working in
> tomcat 5.5.23.  We are trying to  make a get request to external site. The
> page contains some utf-8 characters. 

No.  The page probably contains Unicode characters, all encoded in the 
UTF-8 encoding.  What you probably mean is that some of these characters 
have a Unicode codepoint above 127 decimal, and are thus represented by 
2 or more bytes in UTF-8.
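
To make the point about byte counts concrete, here is a minimal sketch, assuming Java 7 or later for `StandardCharsets`. The class name `Utf8ByteLengths` and the sample code points are made up for illustration and are not taken from your application.

```java
import java.nio.charset.StandardCharsets;

final class Utf8ByteLengths {
    // Returns how many bytes UTF-8 needs to represent a single Unicode code point.
    static int utf8Length(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // 'A', 'é', '€' and an emoji: 1, 2, 3 and 4 bytes respectively.
        int[] samples = {0x41, 0xE9, 0x20AC, 0x1F600};
        for (int cp : samples) {
            System.out.printf("U+%04X -> %d byte(s) in UTF-8%n", cp, utf8Length(cp));
        }
    }
}
```

Only code points up to 127 fit in a single byte; everything above that takes 2, 3 or 4 bytes.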

> When we access the page from the
> application we are getting the following error. 
> Can you please help us to resolve this issue. Any help is appreciated.

From your log below, it does not look like you have problems when 
accessing the external page.  Reading the page is fine, and the content 
of the page is being properly translated, from its original UTF-8 
encoding, into a Unicode string in Java (in your servlet).

However, what happens next is that your servlet is trying to output this 
string to the servlet output stream, which is specified as having the 
ISO-8859-1 charset/encoding.  And at least one of these internal Unicode 
characters does not have a valid representation in ISO-8859-1.  So Java 
complains at the moment you are trying to write out this character, 
because it cannot translate it from the internal Unicode to the 
desired external ISO-8859-1: that particular character does not 
exist in ISO-8859-1, which contains only the 256 characters that are 
part of the Latin-1 set and covers only some Western European languages.
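
The sequence described above (decoding from UTF-8 works, encoding to ISO-8859-1 does not) can be sketched with plain `java.nio.charset` calls. This is only an illustration under assumptions: the class `Latin1OutputCheck`, the helper `firstUnmappableIndex` and the sample bytes are hypothetical, and it does not reproduce what `ServletOutputStream` does internally.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

final class Latin1OutputCheck {
    // Returns the index of the first char that ISO-8859-1 cannot represent,
    // or -1 if the whole string can be written in that encoding.
    static int firstUnmappableIndex(String text) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        for (int i = 0; i < text.length(); i++) {
            if (!encoder.canEncode(text.charAt(i))) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical page content as raw bytes: 'é' (C3 A9) and '€' (E2 82 AC) in UTF-8.
        byte[] pageBytes = {(byte) 0xC3, (byte) 0xA9, (byte) 0xE2, (byte) 0x82, (byte) 0xAC};

        // Step 1: decoding the UTF-8 bytes into a Java (Unicode) string works fine.
        String page = new String(pageBytes, StandardCharsets.UTF_8);

        // Step 2: 'é' exists in ISO-8859-1, but '€' (U+20AC) does not.
        int bad = firstUnmappableIndex(page);
        if (bad >= 0) {
            System.out.printf("Cannot write U+%04X as ISO-8859-1%n", (int) page.charAt(bad));
        } else {
            System.out.println("Every character fits in ISO-8859-1");
        }
    }
}
```

A check like this, run on the string just before it is written out, would show which character is the offending one.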

Now having written all that, I am still a bit uneasy, if the <83> below 
represents the hexadecimal Unicode codepoint of this character.  Because 
0083 is a character known as "NBH" (No Break Here), which looks like some 
kind of control character. So where would that one come from, in an HTML page?
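
For what it is worth, the properties of U+0083 can be checked with standard `Character` and `java.nio.charset` calls. A small sketch follows; the class name `CodePoint83` and its helpers are made up for illustration, and it only describes how the JDK's own ISO-8859-1 encoder treats this code point, not how the servlet output stream decides what to reject.

```java
import java.nio.charset.StandardCharsets;

final class CodePoint83 {
    // C1 control characters occupy U+0080..U+009F.
    static boolean isC1Control(int codePoint) {
        return codePoint >= 0x80 && codePoint <= 0x9F;
    }

    // Asks the JDK's ISO-8859-1 encoder whether it can encode the given char.
    static boolean latin1CanEncode(char c) {
        return StandardCharsets.ISO_8859_1.newEncoder().canEncode(c);
    }

    public static void main(String[] args) {
        int cp = 0x83;
        System.out.println("ISO control:        " + Character.isISOControl(cp));
        System.out.println("Category CONTROL:   " + (Character.getType(cp) == Character.CONTROL));
        System.out.println("C1 range:           " + isC1Control(cp));
        System.out.println("Latin-1 can encode: " + latin1CanEncode((char) cp));
    }
}
```

All four checks report true, so the JDK's own encoder would accept U+0083. That is one more reason to doubt that <83> is really the Unicode codepoint of the offending character.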

> WARNING: Handler caused Not an ISO 8859-1 character: <83>
> Not an ISO 8859-1 character: <83>
> at javax.servlet.ServletOutputStream.print(
> at 
> at
