tomcat-dev mailing list archives

From Christian Mallwitz
Subject tomcat 4.0 m4: bug while submitting UTF-8 data to JSP page
Date Mon, 13 Nov 2000 17:23:15 GMT

I have a JSP file (see attachment) that lets you submit text in UTF-8 to
the same JSP file. For this to work, the JSP file contains code for
converting the submitted text from Unicode to UTF-8.
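
The attached JSP is not reproduced here, so the following is only a minimal sketch of what such a conversion might look like. It assumes the container hands each submitted byte back as one ISO-8859-1 character. The class name Utf8Repair and its methods are hypothetical, and StandardCharsets is a modern Java API that was not available in 2000.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not the code from the attachment.
class Utf8Repair {
    // Assumption: every char in 'raw' holds exactly one submitted byte
    // (i.e. the container decoded the request as ISO-8859-1).
    static String repair(String raw) {
        byte[] bytes = raw.getBytes(StandardCharsets.ISO_8859_1); // recover the raw bytes
        return new String(bytes, StandardCharsets.UTF_8);         // decode them as UTF-8
    }

    // Formats each char of the string as a hex code, like the "as hex" line.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append("0x").append(Integer.toHexString(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String raw = "\u00e2\u0082\u00ac"; // what ?text=%E2%82%AC should arrive as
        System.out.println("text [as hex]    = " + hex(raw));         // 0xe2 0x82 0xac
        System.out.println("text [corrected] = " + hex(repair(raw))); // 0x20ac
    }
}
```

The round trip only works if every char is at most 0xFF; a char such as 0x201a cannot be mapped back to a single ISO-8859-1 byte, so the repair step has nothing valid to decode.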

I ran some tests submitting the Euro symbol. In Unicode this is code point
0x20ac, and in UTF-8 it is 0xE2 0x82 0xAC (3 bytes). This works in all servlet
engines I know of, including Tomcat up to 3.2 beta 6, but not in Tomcat 4.0 m4.

If you have a URL like http://host/post.jsp?text=%E2%82%AC, I expect the
following output:

text [as text]   = â‚¬
text [as hex]    = 0xe2 0x82 0xac 
text [corrected] = EUR

but I get

text [as text]   = â‚¬
text [as hex]    = 0xe2 0x201a 0xac 
text [corrected] = 

Note the second hex code. Interestingly, 0x201a is the Unicode code point
for the ‚ character (single low-9 quotation mark), but I'm clueless how
Tomcat got there ...
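
One possible explanation, offered as an assumption and not as a confirmed diagnosis of Tomcat 4.0 m4: in the windows-1252 code page, byte 0x82 maps to code point 0x201a, whereas in ISO-8859-1 it maps to 0x82. The sketch below decodes the three Euro bytes with both charsets to show the difference. The class name DecodeCompare is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not reproduce any Tomcat code.
class DecodeCompare {
    // Decodes the given bytes with the given charset.
    static String decode(byte[] bytes, Charset cs) {
        return new String(bytes, cs);
    }

    // Formats each char of the string as a hex code.
    static String hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) sb.append(' ');
            sb.append("0x").append(Integer.toHexString(s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] euro = { (byte) 0xE2, (byte) 0x82, (byte) 0xAC }; // UTF-8 bytes of the Euro sign
        System.out.println(hex(decode(euro, StandardCharsets.ISO_8859_1)));      // 0xe2 0x82 0xac
        System.out.println(hex(decode(euro, Charset.forName("windows-1252")))); // 0xe2 0x201a 0xac
    }
}
```

The second line matches the unexpected hex output above, which would be consistent with the request bytes being decoded with a Windows code page instead of ISO-8859-1 somewhere along the way.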

PS: I have attached a JSP file with more multibyte samples ...
Christian Mallwitz INTERSHOP Communications Germany
Senior Software Engineer    phone: +49 3641 894 334
