tomcat-users mailing list archives

From "Steve Vanspall" <>
Subject Internationalisation Question
Date Mon, 15 Apr 2002 00:20:57 GMT
Ok, having tested a bit more, I think I can give a clearer description of my
problem.

I am currently in the process of making my application multilingual.

I have successfully altered my database to support this, and it now uses the
UTF-8 character set.

I have changed the meta tag in my pages to set the charset to UTF-8.

Retrieving information from the database seems to be OK. However, all the
pages have forms for entering/altering data. If I enter foreign characters
into a form, they are received in the database as a string of HTML-style
numeric character codes.

e.g. &#23445;&#34259;&#54301;

Those aren't the exact integers, but that is the pattern.

Now, the character encoding filter in Tomcat 4.0.3 is not doing anything with
these characters because it is reading them one by one ('&' '#' '2' '3' '4'
'4' '5' ';') and, finding them to be normal characters, does not try to
convert them.

I have then extended the request interception method (doFilter) with a
method that strips the '&#' and ';' from either end of the number. It then
creates an int out of the remaining string ('23445'). When I cast this int
to a char, it seems to come up with the correct character when I debug. This
is correct right up until I try to convert the string to UTF-8 or just enter
it into the database. It then becomes '????'.
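
The stripping and casting step described above could be sketched roughly as
follows. This is only an illustration under assumptions: the class name
NumericEntityDecoder and its decode method are hypothetical (they are not part
of Tomcat or its example filters), it handles decimal references only, and it
uses Character.toChars rather than a plain cast so that values above 65535 do
not get truncated.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Tomcat.
final class NumericEntityDecoder {
    // Matches decimal references such as &#23383; (hex forms like &#x5B57; are not handled).
    private static final Pattern DECIMAL_REF = Pattern.compile("&#(\\d{1,7});");

    // Replaces each decimal character reference with the character it names.
    static String decode(String input) {
        Matcher m = DECIMAL_REF.matcher(input);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group(); // leave out-of-range references untouched
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String decoded = decode("name=&#23383;&#34259;");
        // Print code points rather than the characters themselves, since the
        // console encoding may not be able to display them.
        decoded.codePoints().forEach(cp -> System.out.println(Integer.toHexString(cp)));
    }
}
```

Note that a decoder like this cannot tell a reference the browser generated
from one the user literally typed, so it may be better treated as a workaround
than as a fix for the underlying encoding mismatch.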

My questions are:

1. As I don't have a foreign keyboard, and am entering the characters using
the Windows Character Map, am I entering them in a form that is not the same
as if someone using a Chinese keyboard entered them?

I.e., is the encoding different? Given that the Java code seems to be OK with
the integers as chars, I am thinking this is not the case.

2. Is there something I am doing wrong with the conversion? At the moment I
am doing new String(origString.getBytes(), "UTF-8");

3. If I am entering them incorrectly, is there an emulation tool that can
help me enter the characters correctly?
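
Regarding the conversion in question 2, one possible explanation for the
'????' is that the no-argument getBytes() encodes with the platform default
charset, and characters that charset cannot represent are replaced with '?'
before the UTF-8 decoding ever runs. The sketch below illustrates this under
assumptions: ISO-8859-1 stands in for whatever the platform default actually
is, the class and method names are hypothetical, and StandardCharsets requires
Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; ISO-8859-1 is an assumed stand-in for the platform default.
final class CharsetRoundTrip {
    // Mimics new String(s.getBytes(), "UTF-8") on a platform whose default
    // charset is ISO-8859-1: unmappable characters become '?' during encoding.
    static String viaLatin1(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encoding and decoding with the same charset leaves the string unchanged.
    static String viaUtf8(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u5B57\u8823";
        System.out.println(viaLatin1(original).equals(original)); // false: data was lost
        System.out.println(viaUtf8(original).equals(original));   // true
    }
}
```

A Java String is already Unicode internally, so if the decoded characters look
right in the debugger, re-encoding the string may be unnecessary; the loss
would then more likely sit at a boundary such as the database driver or
connection settings, though that is a guess from the symptoms described.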
