From: "Andoni"
To: "Tomcat Users List"
Subject: How to UTF-8 your site.
Date: Tue, 10 Jun 2003 12:26:33 +0100

Hello,

I have recently completed the torturous process of translating my web-site into 16 European languages. Having had lots of advice from this list and other sources, I have come to a few conclusions about what a Java / Tomcat web-site needs in order to fully support UTF-8.

These are:

1. JSP pages must include the page directive:

<%@ page contentType="text/html; charset=UTF-8" %>

2. In catalina.bat (Windows), catalina.sh (Unix) or apache$jakarta_config.com (OpenVMS), a switch must be added to the call that launches java. The switch is:

-Dfile.encoding=UTF-8

I cannot find documentation anywhere for this system property or for what it actually does, but it is essential.

3. For translation of inputs coming back from the browser there must be a method that translates from the browser's ISO-8859-1 to UTF-8. It seems to me that ISO-8859-1 is used in all regions, as I have had people in countries such as Greece & Bulgaria test this and they always send input back in ISO-8859-1 encoding. The method, which you will use constantly, should go something like this:

// Needs: import java.io.UnsupportedEncodingException;

/**
 * Convert an ISO-8859-1 format string (which is the default sent by IE)
 * to the UTF-8 format that the database is in.
 */
public String toUTF8(String isoString) {
    String utf8String = null;
    if (null != isoString && !isoString.equals("")) {
        try {
            // Recover the raw bytes, then decode them again as UTF-8.
            byte[] stringBytesISO = isoString.getBytes("ISO-8859-1");
            utf8String = new String(stringBytesISO, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // As we can't translate, just send back the best guess.
            System.out.println("UnsupportedEncodingException is: " + e.getMessage());
            utf8String = isoString;
        }
    } else {
        // Null or empty input is returned unchanged.
        utf8String = isoString;
    }
    return utf8String;
}

I have found that these three steps are all that is necessary to make your site accept any language that UTF-8 can work with. I extend my thanks to those of you on the Tomcat users list who helped me find these little gems.

Kind regards,

Andoni.
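The conversion in step 3 can be tried on its own with a small sketch like the one below. It rests on an assumption the post does not confirm: that the browser actually submitted UTF-8 bytes and the container decoded them as ISO-8859-1, which is the situation where re-encoding to ISO-8859-1 and decoding as UTF-8 recovers the original text. The class name Utf8RoundTrip and the misdecode helper are hypothetical, added only to simulate that situation, and StandardCharsets needs Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

public class Utf8RoundTrip {

    // Same idea as toUTF8 in step 3, written with StandardCharsets
    // so that no checked exception has to be handled.
    public static String toUTF8(String isoString) {
        if (isoString == null || isoString.isEmpty()) {
            return isoString;
        }
        // ISO-8859-1 maps each char 0-255 to one byte, so this recovers the raw bytes.
        byte[] rawBytes = isoString.getBytes(StandardCharsets.ISO_8859_1);
        return new String(rawBytes, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: simulates a container that reads UTF-8 form
    // bytes but decodes them as ISO-8859-1 (the assumed failure mode).
    public static String misdecode(String original) {
        byte[] utf8Bytes = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Greek for "Greece", written with Unicode escapes to keep the source ASCII.
        String typed = "\u0395\u03bb\u03bb\u03ac\u03b4\u03b1";
        String received = misdecode(typed);
        String repaired = toUTF8(received);

        System.out.println(typed.equals(received)); // false
        System.out.println(typed.equals(repaired)); // true
    }
}
```

If the input really were ISO-8859-1 text containing accented characters, the same conversion would damage it, because those single bytes are not valid UTF-8 sequences. The method is therefore only safe when the assumption above holds.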