tomcat-users mailing list archives

From "Rick" <>
Subject UTF-8 Encoding Issue Since 5.0.27 ( gun in my mouth )
Date Wed, 01 Sep 2004 02:44:09 GMT
Since 5.0.27, pretty much all of my UTF-8 i18n code seems to be messed up.

The problem seems to have been introduced by whatever fix was made for this
changelog entry: "ServletResponse.setContentType sets response encoding after
getWriter was called" (Bugtraq 5062838) (luehe).

Now it seems almost impossible to properly set the encoding type of some of
my JSPs and all of my Servlets that return UTF-8 XML data.

As an example, my login page allows the user to switch to Japanese text.
Text data is read with a ResourceBundle, which reads from a UTF-8 encoded
.properties file.
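
One thing that may be worth ruling out on the ResourceBundle side, as a
minimal sketch with plain JDK classes: java.util.Properties.load(InputStream)
always decodes bytes as ISO-8859-1, so a UTF-8 .properties file only comes
out right when it is read through a Reader that names UTF-8 (or when the file
is escaped as \uXXXX). The class name, the key "login.title", and the sample
text are made up for illustration, and Properties.load(Reader) requires
Java 6 or later, so this is an assumption about the setup, not a description
of the actual login page.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

class Utf8PropertiesDemo {
    // Hypothetical .properties content, stored as UTF-8 bytes.
    // The value is the Japanese word for "login" (4 characters, 12 bytes).
    static final byte[] FILE_BYTES =
            "login.title=\u30ED\u30B0\u30A4\u30F3\n".getBytes(StandardCharsets.UTF_8);

    // load(InputStream) decodes every byte as one ISO-8859-1 character,
    // so each 3-byte UTF-8 sequence turns into 3 unrelated characters.
    static String loadAsLatin1() {
        Properties props = new Properties();
        try {
            props.load(new ByteArrayInputStream(FILE_BYTES));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("login.title");
    }

    // load(Reader) uses whatever charset the Reader was created with.
    static String loadAsUtf8() {
        Properties props = new Properties();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(FILE_BYTES), StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("login.title");
    }

    public static void main(String[] args) {
        System.out.println("ISO-8859-1 read, length: " + loadAsLatin1().length());
        System.out.println("UTF-8 read, length:      " + loadAsUtf8().length());
    }
}
```

If the first length is three times the second, the text was already garbled
before it ever reached the JSP or the response.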

If the .jsp file itself is saved as ASCII, then I can't get the
characters to show up at all any more.
I have to save the .jsp file as UTF-8.
I also added "set JAVA_OPTS=-Dfile.encoding=UTF-8" to my catalina.bat file.

Then, if I try to set a character set in my page header, the output gets garbled.

This works in some cases...
<%@ page language="java" import="java.util.*" contentType="text/html" %>
response.getCharacterEncoding() = "ISO-8859-1"

The really scary part is that with no meta tag or charset actually set, the
browser (IE) correctly switches to UTF-8 and displays the content fine. But
if I change the actual file encoding of the .jsp page from UTF-8 back to
ASCII, IE does not switch to UTF-8 and the page is garbled again.
Why does the actual encoding of the .jsp file itself dictate the response
sent to the client?
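
One possible explanation, offered as a guess rather than a confirmed reading
of Tomcat's behaviour: if the container reads the UTF-8 file as ISO-8859-1
and then writes the response as ISO-8859-1, every byte passes through
unchanged, so the browser still receives valid UTF-8 and can auto-detect it.
The sketch below shows only that charset round trip with plain JDK classes;
the class name and sample text are made up, and nothing in it calls Tomcat.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RoundTripDemo {
    // Hypothetical page text: three Japanese characters, 9 bytes in UTF-8.
    static final String TEXT = "\u65E5\u672C\u8A9E";

    // The bytes as they would sit in a file saved as UTF-8.
    static byte[] fileBytes() {
        return TEXT.getBytes(StandardCharsets.UTF_8);
    }

    // Reading those bytes as ISO-8859-1 yields one character per byte.
    static String misread() {
        return new String(fileBytes(), StandardCharsets.ISO_8859_1);
    }

    // Writing the misread text back as ISO-8859-1 restores the original bytes.
    static boolean latin1RoundTripPreservesBytes() {
        byte[] written = misread().getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(fileBytes(), written);
    }

    // Writing the misread text as UTF-8 encodes each character again,
    // which doubles the byte count and produces classic mojibake.
    static int doubleEncodedLength() {
        return misread().getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println("bytes preserved: " + latin1RoundTripPreservesBytes());
        System.out.println("original bytes:  " + fileBytes().length);
        System.out.println("double-encoded:  " + doubleEncodedLength());
    }
}
```

That would fit the symptom that the page looks fine with no charset declared
but breaks once "charset=UTF-8" is added while the source is still being read
as ISO-8859-1.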

It appears that the actual encoding of the source file somehow gets passed
along, and then I'm unable to alter the character encoding. If I try, it
just causes everything to go to hell.

This used to work before 5.0.27, but now it doesn't, even though all data and
pages are encoded in UTF-8.
<%@ page language="java" import="java.util.*" contentType="text/html; charset=UTF-8" %>
response.getCharacterEncoding() = "UTF-8"

Before 5.0.27, all I had to do to get my output in UTF-8 was ...
 contentType="text/html; charset=UTF-8"

Now I have to mess with the actual .jsp file encodings and still can't
get most pages to work properly, and none of my servlets return correct
UTF-8 data.
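
For the servlet side, here is a stdlib-only analogy rather than servlet code:
a Writer's charset is fixed when the Writer is created, which is roughly the
rule the changelog entry describes for getWriter. If a servlet obtains its
writer before a UTF-8 charset is in effect, the output would be encoded with
the default (ISO-8859-1) and unmappable characters would become "?". The
class name and the XML snippet are hypothetical, and OutputStreamWriter
stands in for the response writer here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class WriterEncodingDemo {
    // Hypothetical XML payload containing two Japanese characters.
    static final String XML = "<name>\u6771\u4EAC</name>";

    // Encode the payload through a Writer bound to the given charset.
    // The charset cannot be changed once the Writer exists.
    static byte[] write(Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(XML);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Decode the written bytes with the same charset to see what survived.
    static String render(Charset charset) {
        return new String(write(charset), charset);
    }

    public static void main(String[] args) {
        System.out.println(render(StandardCharsets.ISO_8859_1));
        System.out.println(render(StandardCharsets.UTF_8).equals(XML));
    }
}
```

In servlet terms, the equivalent precaution would be to call setContentType
with "charset=UTF-8" (or setCharacterEncoding) before the first call to
getWriter.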

I have tried setting "pageEncoding" in the page directive as well, with no luck.

Thanks for anyone's insight or help on this. It's never fun to find out that
something that had been working quite solidly up and blows up for no good
reason.

My current dev machine runs Windows XP, by the way, with a vanilla install of Tomcat.
I will be setting this up on a Linux box for more testing shortly.
