tomcat-users mailing list archives

Subject Re: still suffering character-encoding woes
Date Thu, 21 Feb 2002 19:32:19 GMT

Hi, Richard.

     I can definitely relate.  This frustrated (sort of still does frustrate)
me to no end.  Let me try to explain what I understand after wrestling with
this for a while.  (gurus, if anything isn't quite right, please chime in!)

1.)  The following are all supposed to have the same effect:
     - setting the <META HTTP-EQUIV="Content-Type"
       CONTENT="text/html;charset=UTF-8"> tag in the HTML head
     - using the JSP page directive
       <%@ page contentType="text/html;charset=UTF-8" %>
     - calling response.setContentType("text/html;charset=UTF-8");

     These are all supposed to tell the browser what encoding to use when
displaying the page it gets.  I have found that the JSP directive works the
best for me.  The META tag didn't seem to be consistently working for me, and
I find the JSP page directive easier to use than the response.setContentType.
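
     To see why the declared charset matters, here is a minimal sketch in
plain Java (no servlet API; the class name CharsetMismatchDemo and the sample
word are made up for illustration).  It encodes a string as UTF-8, as the
server would when writing the page, and then decodes the same bytes the way a
browser would if it assumed a different charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; plain Java, no servlet API involved.
class CharsetMismatchDemo {

    // Encode the text as UTF-8 (what the server sends), then decode those
    // bytes with whatever charset the browser assumes.
    static String decodeAs(String text, String assumedCharset) {
        byte[] wire = text.getBytes(StandardCharsets.UTF_8);
        return new String(wire, Charset.forName(assumedCharset));
    }

    public static void main(String[] args) {
        // Sample word containing a-ring (U+00E5) and ae (U+00E6).
        String page = "bl\u00e5b\u00e6r";
        // Same charset on both ends: the text survives.
        System.out.println(decodeAs(page, "UTF-8").equals(page));   // true
        // Browser assumes ISO-8859-1: each two-byte character shows up as
        // two separate characters, so 6 characters become 8.
        System.out.println(decodeAs(page, "ISO-8859-1").length());  // 8
    }
}
```

     The unicode escapes keep the example independent of the encoding the
source file itself happens to be saved in.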

2.)  On the receiving end, when values are passed to you in the request,
that's another issue.  I'm not too sure how the charset for passing parameters
is determined.  I believe that they will be encoded according to the encoding
of the page with the submitting form.  I've also seen some discussion on
specifying the char set in the Form tag, but it sounds like that doesn't work
very well.  Anyways, I have been able to get my parameters ok by setting the
encoding for the page with the JSP directive mentioned above and then getting
the parameter values like this...

          String param = request.getParameter("param");
          byte[] bytes = param.getBytes("UTF-8");
          String paramForDB = new String( bytes, "UTF-8" );

     I don't really understand what kind of transformation is happening with
the getBytes and creation of the new String (***Can anyone else explain this
to me?***), but it seems to get the job done.
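
     For reference, a sketch of what those two calls do.  getBytes(X) encodes
the string's characters into bytes using charset X, and new String(bytes, Y)
decodes bytes back into characters using charset Y.  With X and Y both UTF-8
the result equals the input, so the three lines as posted should leave the
value unchanged.  The variant usually posted for this problem uses ISO-8859-1
in the getBytes step.  It assumes the container decoded the UTF-8 form bytes
as ISO-8859-1 (the servlet default when the request names no charset) and
undoes that.  In the sketch below URLEncoder and URLDecoder stand in for the
browser and the container; the class and method names are made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not Tomcat code.
class ParamRoundTripDemo {

    // Browser side: percent-encode the form value in the page's charset.
    static String submit(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Container side (assumption): no request charset known, so the
    // percent-decoded bytes are read as ISO-8859-1.
    static String receive(String wire) {
        try {
            return URLDecoder.decode(wire, "ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // The snippet as posted: encode and decode with the same charset.
    static String sameCharsetRoundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.UTF_8);
    }

    // The common variant: recover the raw bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String typed = "\u00f8l"; // o-slash (U+00F8) followed by "l"
        String wire = submit(typed);
        String garbled = receive(wire);
        System.out.println(wire);                                          // %C3%B8l
        System.out.println(garbled.equals(typed));                         // false
        System.out.println(sameCharsetRoundTrip(garbled).equals(garbled)); // true
        System.out.println(repair(garbled).equals(typed));                 // true
    }
}
```

     The repair step only makes sense when the value really was decoded with
the wrong charset; applied to a correctly decoded string it would damage it.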

3.)  Also, make sure that your database can handle "high-bit" bytes (values
greater than 127, outside 7-bit ASCII), since UTF-8 stores every non-ASCII
character as a sequence of such bytes.
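
     To make the "high-bit" point concrete, this small sketch counts the
bytes above 127 in the UTF-8 form of a string.  The class name is made up,
and it says nothing about how any particular database is configured.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class.
class HighBitDemo {

    // Count the bytes with a value above 127 in the UTF-8 encoding of s.
    static int highBitBytes(String s) {
        int count = 0;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if ((b & 0xFF) > 127) { // Java bytes are signed, so mask first
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(highBitBytes("abc"));                // 0
        // ae, o-slash, a-ring: three characters, two bytes each in UTF-8
        System.out.println(highBitBytes("\u00e6\u00f8\u00e5")); // 6
    }
}
```

     Note that the three characters take six bytes, which matters if a column
length is counted in bytes rather than characters.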

Hope this helps.  Please let me know if you come to an understanding of what's
going on behind the scenes (I would LOVE to know!).  Thanks.

From:     [...] Sand <rsand@vgalle[...]>, 11:55 AM
To:       "Tomcat Users List"
Subject:  still suffering character-encoding woes
Please respond to "Tomcat Users List"

Hi all,

I've read with interest the recent threads about how to get posted form data
containing the special alphabetic characters used in many European languages
handled properly.  I've tried every suggestion that I saw in those threads
to no avail, and am starting to tear my hair out.  A quick summary:

My development environment is Apache 1.3.20, Tomcat 4.0.1, and DB2 on
Windows 2000, with locale = Norwegian.
My production environment switches the database to Postgres and the OS to
Linux; its environment has LANG=C and LC_CTYPE=iso-8859-1.

Basically, on my development environment everything works perfectly: I can
post data containing Norwegian characters and they are stored properly in
the database and logged properly in the log files.

On the production server, it's '?' everywhere.

Now to fix the problem, I've tried the following steps, in sequence:

1) I cut Apache out of the loop and accessed Tomcat directly to see if it
interfered at all: no change
2) I added <META HTTP-EQUIV="Content-Type"
CONTENT="text/html;charset=UTF-8"> inside my HTML head tags
3) I added <%@ page contentType="text/html; charset=UTF-8" %> to my JSP pages
4) Finally, I tried doing request.setCharacterEncoding("UTF-8") at the top
of my doGet and doPut methods

None of the first 3 steps helped; I still got that '?'.  I should point out
that if I did <%=request.getParameter("someparam")%> in my JSP page, I saw my
special characters echoed back; somehow the corruption of my post data happens
only when I write the data to a log file or into a database.
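
One way a '?' can appear only on output, sketched in plain Java: when a
Writer uses a charset that cannot represent a character, Java substitutes
that charset's replacement byte, which is '?' for US-ASCII.  Whether the
production JVM actually defaults to US-ASCII under LANG=C is an assumption
here, not something confirmed in this thread, and the class name is made up.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical demo class; a byte array stands in for the log file.
class LogEncodingDemo {

    // Write the text through a Writer that uses the named charset and
    // return the bytes that would land in the file.
    static byte[] write(String text, String charsetName) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, Charset.forName(charsetName))) {
            w.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String name = "bl\u00e5"; // "bl" followed by a-ring (U+00E5)
        byte[] ascii = write(name, "US-ASCII");
        byte[] utf8 = write(name, "UTF-8");
        System.out.println((char) ascii[2]); // ?
        System.out.println(utf8.length);     // 4
    }
}
```

Naming the charset explicitly on the Writer, instead of relying on the
platform default, makes the output independent of the LANG setting.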

And when I tried step 4, suddenly all of my strings got terminated wherever
a special character occurred.  In other words, instead of a '?' it was as if
the string was terminated by a '\0'.

Can anyone help?!?!


Best regards,


To unsubscribe:   <>
For additional commands: <>
Troubles with the list: <>
