tomcat-users mailing list archives

From Tõnu Põld <>
Subject RE: foriegn characters turn to ? in database
Date Thu, 12 Jul 2001 10:23:22 GMT
It should work, provided Java has UTF-8 support, and it does.

Another possibility is that the Java VM lacks international character
encoding support. I had trouble with this some time ago, so you must get
the "International" version of the JDK (as far as I know, Sun's JDK download
page offers two choices, one of which is the "International version").

But be aware that most Internet browsers do not send the encoding in the
HTTP request, so on the server side we must guess which encoding the request
bytes are in. By default Tomcat 3.2 assumes they are Latin-1. So if you post
a request from a page that is in UTF-8, the browser sends the request in
UTF-8, but because the encoding is not declared, Tomcat converts the bytes
to strings using Latin-1. To convert them correctly you could use something
like:

// Recover the raw bytes Tomcat decoded as Latin-1, then decode them as UTF-8.
String s = new String(
    request.getParameter("my_param").getBytes("ISO-8859-1"), "UTF-8");
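
To see why that round trip works without a servlet container, here is a minimal standalone sketch. The class and method names are made up for illustration, and it assumes Java 7 or later for StandardCharsets. It simulates the wrong Latin-1 decoding described above and then reverses it the same way as the one-liner:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only and are not
// part of Tomcat or the servlet API.
class ParamRecoder {

    // Simulates the server decoding UTF-8 request bytes as Latin-1.
    static String misdecodeAsLatin1(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    // Reverses the wrong decoding: Latin-1 maps every byte to exactly one
    // character, so getBytes(ISO_8859_1) gives back the original bytes,
    // which can then be decoded as UTF-8.
    static String recodeLatin1ToUtf8(String misdecoded) {
        if (misdecoded == null) {
            return null; // a missing request parameter stays null
        }
        byte[] rawBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "T\u00f5nu P\u00f5ld"; // contains U+00F5
        String garbled = misdecodeAsLatin1(original);
        String repaired = recodeLatin1ToUtf8(garbled);
        System.out.println("round trip ok: " + repaired.equals(original));
    }
}
```

Note that this only helps when the bytes really were UTF-8. Applying it to a parameter that was correctly decoded as Latin-1 would damage any non-ASCII characters in it.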

Also, some browsers might send UTF-8 request bytes in the form %XXXX; I
don't know how Tomcat handles that.
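
For the escaped case, here is a rough sketch of how the standard java.net.URLDecoder treats percent-escaped UTF-8 bytes, assuming they arrive as ordinary escapes such as %C3%B5. The class name is hypothetical, and this says nothing about what Tomcat 3.2 itself does with such a request:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; it only shows the JDK's own decoder.
class PercentDecodeSketch {

    // Decodes percent-escapes as UTF-8 bytes ('+' becomes a space).
    static String decodeUtf8(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Encode a name containing U+00F5 the way a form submission might.
        String encoded = URLEncoder.encode("T\u00f5nu", "UTF-8");
        System.out.println(encoded);
        System.out.println(decodeUtf8(encoded).equals("T\u00f5nu"));
    }
}
```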


> -----Original Message-----
> From: James Radvan []
> Sent: Thursday, July 12, 2001 12:23 PM
> To: ''
> Subject: RE: foriegn characters turn to ? in database
> Out of interest, will using "charset=UTF-8" work? (unicode)
> James
> ---------------------------------
> James Radvan
> Websphere Analyst/Architect
> London, UK
> +44 7990 624899
