tomcat-users mailing list archives

From "Allistair Crossley" <Allistair.Cross...@QAS.com>
Subject RE: UTF-8 Encoding in Jsp
Date Wed, 01 Dec 2004 10:54:04 GMT
Hi,

These encoding issues are always a nightmare ;) 

There are some relevant areas of the Servlet spec you may want to look at with regard to
encoding, notably the Internationalization and Request data encoding sections.

If UTF-8 is not coming back correctly from your database, you need to ensure that the data
was UTF-8 when it was _added_. You should also verify that your database is in UTF-8 mode.
If both of these are true, then you need to read the Internationalization section of the
Servlet spec, which says

"If the servlet does not specify a character encoding before the getWriter
method of the ServletResponse interface is called or the response is committed,
the default ISO-8859-1 is used."

In other words, you need to call setLocale or setCharacterEncoding on the response before it
is committed. I am not entirely sure whether that is actually what the JSP page directive is
doing; maybe it is. Perhaps in your JSP you can output <%= response.getCharacterEncoding() %>
to make sure UTF-8 has been set. If it prints ISO-8859-1, it has not been set. If it _is_
UTF-8, then the character data coming from the database is not actually UTF-8, because
a) your database driver connection URL is not operating in UTF-8 mode, b) the data was not
UTF-8 when it was put into the database, or c) the database is not running in UTF-8.
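
To see why the ISO-8859-1 default matters for the response, here is a minimal plain-Java sketch. It uses no servlet API and only imitates what a response writer does: encode the text with the response charset, then decode it with the same charset as a browser would. The class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ResponseEncodingDemo {
    // Encode text the way a response writer would, then decode it again
    // with the same charset, as a browser trusting the declared charset does.
    static String roundTrip(String text, Charset responseCharset) {
        byte[] wire = text.getBytes(responseCharset);
        return new String(wire, responseCharset);
    }

    public static void main(String[] args) {
        String japanese = "\u65e5\u672c\u8a9e"; // three Japanese characters

        // ISO-8859-1 cannot represent these characters, so each becomes '?'.
        System.out.println(roundTrip(japanese, StandardCharsets.ISO_8859_1));

        // UTF-8 can represent them, so the text survives unchanged.
        System.out.println(roundTrip(japanese, StandardCharsets.UTF_8).equals(japanese));
    }
}
```

If the page shows question marks or junk only for non-Latin text, the response charset is a likely suspect.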

In terms of sending data to the database as UTF-8, check your driver parameters (normally on
the URL string) and also the database settings. You also need to take note of the Request data
encoding section of the Servlet spec. We had to write a servlet filter to change our inbound
form posts to the correct encoding for our database, Cp1252.

Request data encoding extract 

The default encoding of a request the container uses to create the
request reader and parse POST data must be ISO-8859-1 if none has been
specified by the client request. However, in order to indicate to the developer in this
case the failure of the client to send a character encoding, the container returns null
from the getCharacterEncoding method.

If the client hasn't set character encoding and the request data is encoded with
a different encoding than the default as described above, breakage can occur. To
remedy this situation, a new method setCharacterEncoding(String enc) has
been added to the ServletRequest interface. Developers can override the
character encoding supplied by the container by calling this method. It must be
called prior to parsing any post data or reading any input from the request.
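
The breakage described in the extract can be reproduced in plain Java. This is only a sketch of the effect, not container code: parseBody is a hypothetical stand-in for the container decoding POST bytes, and the charset argument plays the role of the value you would pass to setCharacterEncoding.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class RequestDecodingDemo {
    // Hypothetical stand-in for the container turning raw POST bytes into text.
    static String parseBody(byte[] rawBody, Charset assumedCharset) {
        return new String(rawBody, assumedCharset);
    }

    public static void main(String[] args) {
        // A browser submits "cafe" with an acute accent on the e, encoded as UTF-8.
        byte[] rawBody = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        // Default behaviour: decoded as ISO-8859-1, the two UTF-8 bytes of the
        // accented letter turn into two separate junk characters.
        System.out.println(parseBody(rawBody, StandardCharsets.ISO_8859_1));

        // With the correct charset chosen before parsing, the text is intact.
        System.out.println(parseBody(rawBody, StandardCharsets.UTF_8));
    }
}
```

The decoding charset has to be chosen before the bytes are turned into text, which is why the spec requires setCharacterEncoding to be called before any parameter is read.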

Hope this info gets you thinking, Allistair.

> -----Original Message-----
> From: Arnab Chakravarty [mailto:achakravarty@sapient.com]
> Sent: 01 December 2004 10:38
> To: Tomcat Users List
> Subject: RE: UTF-8 Encoding in Jsp
> 
> 
> Hi,
> 
> Thanks for the reply, but it did not work. Maybe I didn't explain the
> problem correctly.
> 
> I am running an application that supports all languages, but only in
> some specific places of the application, and I have made those places
> UTF-8 compliant.
> 
> Further, they are being saved to the database (Oracle 9). When we read
> the data back from the database, junk characters are displayed on the
> screen. Yes, the database is set to support UTF-8 encoding, and this
> works with the old version, Tomcat 3.3, but not with the current
> upgraded version, Tomcat 5.0.
> 
> There are also places in the application where drop-downs contain text
> in other languages, and we can see those character sets (Japanese,
> Chinese, etc.) appearing. Only when I try to display the data on the
> screen through the JSP file do I encounter this problem of junk
> characters being displayed.
> 
> I hope I have set more context around the problem. Please help me
> resolve this issue.
> 
> Thanks,
> Arnab
> 
> -----Original Message-----
> From: Mariano [mailto:mlopez@sescam.org] 
> Sent: Wednesday, December 01, 2004 12:54 PM
> To: 'Tomcat Users List'
> Subject: RE: UTF-8 Encoding in Jsp
> 
> You should also use:
> 
> <head>
> 	<META http-equiv="Content-Type" content="text/html; charset=UTF-8">
> </head>
> 
> and this scriptlet:
> 
> 	request.setCharacterEncoding("UTF-8");
> 
> at the beginning.
> 
> I hope this helps.
> 
> -----Original Message-----
> From: Arnab Chakravarty [mailto:achakravarty@sapient.com]
> Sent: Tuesday, November 30, 2004 15:28
> To: Tomcat Users List
> Subject: UTF-8 Encoding in Jsp
> 
> 
> Hi all,
> 
> I need to make all my JSP files compatible with UTF-8 encoding, and
> even though I am using the directives:
> 
> <%@ page pageEncoding="UTF-8"%>
> <%@ page contentType = "text/html;charset=UTF-8"%>
> 
> in the JSP files, I cannot make it work.
> 
> I am using Tomcat version 5. Are there any config changes I need to
> make for UTF-8 encoding to work?
> 
> Please help.
> 
> Thanks in advance,
> Arnab
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: tomcat-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: tomcat-user-help@jakarta.apache.org
> 


-------------------------------------------------------
QAS Ltd.
Developers of QuickAddress Software
www.qas.com
Registered in England: No 2582055
Registered in Australia: No 082 851 474
-------------------------------------------------------

