tomcat-users mailing list archives

From "mech" <m...@rz.fh-augsburg.de>
Subject RE: Handling non-Latin chars in servlet, jdbc
Date Tue, 11 Feb 2003 12:20:37 GMT


> -----Original Message-----
> From: Joe Tomcat [mailto:tomcat@mobile.mp] 
> Sent: Sunday, 9 February 2003 12:47
> To: Tomcat Users List
> Subject: Handling non-Latin chars in servlet, jdbc
> 
> 
> Hello fellow Tomcatists,
> 
> It is time for my web app to move beyond the confines of the A-B-Cs. 
> This app takes user input from web forms, stores it in 
> various fields in a database, and then displays it back in 
> various ways.  The goal is to have it so that a user can 
> enter Japanese or other Asian language chars into the form in 
> his browser, the web app stores the form input in the db, and 
> later on, displays it back to the browser and the chars show 
> up the right way.
> 
> It seems like this should be easy.  Java is designed for 
> multibyte, and I think Postgres can also store multibyte 
> chars, but I'm running into a block.  My friend in Japan 
> entered some chars into a form, and hit submit, and what was 
> stored in the db were html entities.  Then, when he displayed 
> it back to his browser, it was a problem because my output 
> code automatically escapes html entities, so what he saw was 
> "&48832;" or something, instead of the ji he was expecting.
> 
> Does anyone have some tips on this, or pointers to articles 
> or books I should be reading about how to do this?
> 
First:
Make sure that your generated HTML page has a Content-Type header that tells
the browser which character encoding you are using. Otherwise the browser might
assume Latin-1 when parsing the page even if you are sending Unicode.
Things like <%@ page contentType="text/html; charset=xyz" %> or <meta
http-equiv="content-type" content="text/html; charset=xyz"> might help.
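
To illustrate why the declared charset matters, here is a minimal sketch using only the Java standard library. The class and method names (CharsetMismatchDemo, contentType, decodeAs) are made up for this example. It builds a Content-Type value and simulates a browser that decodes UTF-8 bytes as Latin-1 because nothing told it otherwise.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Tomcat or the servlet API.
final class CharsetMismatchDemo {

    // Builds a Content-Type value such as "text/html; charset=UTF-8".
    // In a servlet this string would typically be passed to
    // response.setContentType(...) before obtaining the writer.
    static String contentType(Charset charset) {
        return "text/html; charset=" + charset.name();
    }

    // Encodes text with one charset and decodes the bytes with another,
    // which is roughly what happens when a browser guesses the wrong encoding.
    static String decodeAs(String text, Charset sentAs, Charset readAs) {
        return new String(text.getBytes(sentAs), readAs);
    }

    public static void main(String[] args) {
        String original = "\u00FC"; // German u-umlaut

        System.out.println(contentType(StandardCharsets.UTF_8));

        // Wrong guess: the two UTF-8 bytes become two Latin-1 characters.
        String garbled = decodeAs(original, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length());

        // Matching charsets: the text survives unchanged.
        String intact = decodeAs(original, StandardCharsets.UTF_8, StandardCharsets.UTF_8);
        System.out.println(intact.equals(original));
    }
}
```

The same effect applies in the other direction: form data submitted by the browser has to be decoded with the encoding the page was served in.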

Second (from my own bad experience ;-)):
I use MySQL, which also supports Unicode. But you have to set the encoding
you want MySQL to use. Otherwise it tries to determine the encoding by
checking the system's default. I ran into trouble because my development
server had a German installation whereas my production machine has an
English setup. So I had "Latin" vs. "Latin-1". I wondered what had happened
to my German special characters, but the actual problem was that my
JDBC driver talked to the database in the wrong encoding, so the problem
was already in my data access classes, not in the JSP or HTML
processing.
After telling MySQL in the connection URL to use Latin-1 encoding, my
problem was gone.
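
As a rough sketch of what setting the encoding in the connection URL can look like: MySQL Connector/J has the URL properties useUnicode and characterEncoding. The host, database name and the helper class below are placeholders invented for this example, and the exact encoding names your driver version accepts should be checked against its documentation.

```java
// Hypothetical helper, only meant to show the shape of the URL.
final class MysqlUrlSketch {

    // Builds a Connector/J style URL that names the encoding explicitly
    // instead of relying on the system default of the machine.
    static String jdbcUrl(String host, String database, String encoding) {
        return "jdbc:mysql://" + host + "/" + database
                + "?useUnicode=true&characterEncoding=" + encoding;
    }

    public static void main(String[] args) {
        // "localhost" and "mydb" are placeholder values.
        String url = jdbcUrl("localhost", "mydb", "UTF-8");
        System.out.println(url);

        // With the driver on the classpath, the URL would be used like this:
        // Connection con = DriverManager.getConnection(url, user, password);
    }
}
```

I used Latin-1 for my German text. For Japanese input you would presumably want a Unicode encoding such as UTF-8 here, as in the example call.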

So you should also check whether your problem is on the
database/driver side, with your characters already getting garbled there.
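
One possible way to check this is to dump the Unicode code points of a string before you write it to the database and again after you read it back, and compare the two. The sketch below is only an illustration. The class name CodePointDump is made up, and it needs Java 8 or later.

```java
// Hypothetical diagnostic helper for comparing strings before and after a database round trip.
final class CodePointDump {

    // Returns the code points of a string as "U+XXXX" values separated by spaces.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // What you intended to store: a single u-umlaut.
        System.out.println(dump("\u00FC"));
        // What a UTF-8/Latin-1 mix-up typically hands back: two characters.
        System.out.println(dump("\u00C3\u00BC"));
    }
}
```

If the two dumps differ for the same row, the characters are being changed between your code and the database, not in the JSP output.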


Michael



