jakarta-taglibs-dev mailing list archives

From "tae_grt@newmail.ru" <tagu...@newmail.ru>
Subject i18n charset issues
Date Fri, 09 Nov 2001 20:32:21 GMT
Hello all!

I'm a newcomer who has just spent a few days with the i18n taglib
and its charset troubles, and I believe I have bad
news:

1) If we call response.setLocale() (at least
in a JSP; I haven't tested this in a servlet),
the value output as
Content-Type:text/html; charset=xxx
is not changed (it is still iso-8859-1).
I have tested this on Tomcat 4.01, 3.3m3
and Weblogic 6.0sp1. Same result.
(I tried response.setLocale(Locale.JAPANESE) and
         response.setLocale(new Locale("ru","RU")).)

2) If we do a
<%
response.setContentType("text/html; charset=windows-1251");
%>
anywhere in the page with Tomcat 4.01,
then we do get
Content-Type:text/html; charset=windows-1251
all right, but if the page has

buffer="none"

then an attempt to do <%="\u041f"%>
(a Cyrillic letter) gives us a question mark (?).

3) If we do a
<%
response.setContentType("text/html; charset=windows-1251");
%>
anywhere in the page with Tomcat 4.01
and <%@ page buffer="16k" %>,
then Cyrillic letters come out all right.

4) If we do a
<%
response.setContentType("text/html; charset=windows-1251");
%>
on Weblogic anywhere in the page (including the very
beginning), then even with buffer="16k" we _do not get Cyrillic letters_;
they all come out as ??????
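
The question marks in cases 2 and 4 are what a Latin-1 encoder produces for a character it cannot map. A minimal sketch in plain Java (no servlet API involved; the class and method names are made up for illustration, and it assumes the JVM ships the windows-1251 charset):

```java
import java.nio.charset.Charset;

class EncodeDemo {
    // Encode text with the named charset and return the bytes as lowercase hex.
    // String.getBytes(Charset) substitutes the charset's replacement byte
    // for characters it cannot represent.
    static String hex(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String pe = "\u041f"; // CYRILLIC CAPITAL LETTER PE, as in case 2
        System.out.println("ISO-8859-1:   " + hex(pe, "ISO-8859-1"));   // 3f, i.e. '?'
        System.out.println("windows-1251: " + hex(pe, "windows-1251")); // cf
    }
}
```

So a '?' in the output suggests the bytes were still produced by an iso-8859-1 writer, whatever the Content-Type header says.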


Conclusion:

1. response.setLocale() doesn't change the charset on either Tomcat or
   Weblogic
2. explicit response.setContentType("text/html; charset=xxx")
   from inside a JSP works on Tomcat if we have buffering on; otherwise
   the writer replaces all non-Latin-1 chars with ???-s
3. explicit response.setContentType("text/html; charset=xxx")
   doesn't work if we have buffer="none" in Tomcat and
   doesn't work at all in Weblogic

(The reason is most likely that

out = pageContext.getOut();

happens in a JSP before any code inside the JSP
is executed, and this most likely calls
response.getWriter(),
whereas according to the spec one should call
response.setContentType() and
response.setLocale() before
response.getWriter().)
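
To illustrate that suspicion, here is a toy model in plain Java. ToyResponse is a hypothetical stand-in, NOT the real javax.servlet API, and it only assumes what is guessed above: the encoder is chosen when the writer is first obtained, and later charset changes touch only the header value.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.Charset;

// Hypothetical stand-in for a servlet response (not the real API).
class ToyResponse {
    private String charset = "ISO-8859-1";  // default, as observed in the tests above
    private PrintWriter writer;              // created once, on first getWriter()
    final ByteArrayOutputStream body = new ByteArrayOutputStream();

    // Models setContentType("text/html; charset=cs").
    void setCharset(String cs) { charset = cs; }

    PrintWriter getWriter() {
        if (writer == null) {
            // The encoder is fixed here; later setCharset() calls do not affect it.
            writer = new PrintWriter(
                new OutputStreamWriter(body, Charset.forName(charset)));
        }
        return writer;
    }
}

class WriterOrderDemo {
    // Returns the body bytes (hex) written for text when the writer is
    // obtained before (writerFirst = true) or after the charset is set.
    static String render(boolean writerFirst, String text) {
        ToyResponse r = new ToyResponse();
        if (writerFirst) {
            r.getWriter();                   // writer opened up front
        }
        r.setCharset("windows-1251");
        PrintWriter w = r.getWriter();
        w.print(text);
        w.flush();
        StringBuilder sb = new StringBuilder();
        for (byte b : r.body.toByteArray()) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

In this model render(true, "\u041f") yields 3f (a question mark) and render(false, "\u041f") yields cf, which matches the buffer="none" versus buffer="16k" difference seen on Tomcat, if buffering really does delay the getWriter() call.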

So bad news: it's impossible to do
a dynamic charset switch from a JSP
in a portable way.

We can do it for Tomcat for the case
when buffering is on, though, but
that is not portable.

If you're interested I can send
my proposals on how to set
the charset for this case
explicitly (as .setLocale() does
not set the charset).

My ideas include adding attributes
to the <i18n:bundle> and <i18n:locale>
tags to control setting the charset
or having a special tag for that
<i18n:charset>
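
Purely as an illustration of what such a tag or attribute might do internally, here is a sketch of a locale-to-charset lookup. Everything in it is an assumption: the class name, the table entries (windows-1251 for Russian, Shift_JIS for Japanese) and the iso-8859-1 fallback are hypothetical and not part of the i18n taglib.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class CharsetForLocale {
    // Hand-maintained table: language code -> charset name (illustrative entries only).
    private static final Map<String, String> BY_LANGUAGE = new HashMap<>();
    static {
        BY_LANGUAGE.put("ru", "windows-1251");
        BY_LANGUAGE.put("ja", "Shift_JIS");
    }

    // Builds the value one would pass to response.setContentType().
    static String contentType(Locale locale) {
        String cs = BY_LANGUAGE.getOrDefault(locale.getLanguage(), "ISO-8859-1");
        return "text/html; charset=" + cs;
    }
}
```

The result would still have to be passed to response.setContentType() before the writer is obtained, so this does not remove the portability problem described above.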

but I feel so depressed by not
being able to make this code
work for anything but Tomcat+buffering
that I'm not sure if these ideas are
useful :-((

BTW: it makes perfect sense on
Servlet 2.3 containers to
do

request.setCharacterEncoding(response.getCharacterEncoding());

right at the beginning of the page.
Should we introduce a new tag in i18n or
somewhere else to do that?
I know there's a request interceptor in Tomcat 3.3
that does a similar thing: it remembers the encoding
that the last page of the session was served with
and uses it to decode request parameters, but
the line of code above is totally standard-compliant and
helps, say, in the following example:

<%@ page contentType="text/html; charset=windows-1251" %>
<% request.setCharacterEncoding(response.getCharacterEncoding()); %>


<html>
<body>
<form action="<%=request.getRequestURI()%>">
  <input type="text" name="n">
  <input type="submit">
</form>
n=<%=request.getParameter("n")%>
</body>
</html>

This works fine for those who use Cyrillic, but if we remove
the .setCharacterEncoding call it starts failing with Tomcat 4.01
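
The failure can be reproduced outside a container. The sketch below (plain Java 10+, hypothetical class and method names, assuming the browser submits the form in the page's charset) shows what happens when the parameter bytes are decoded with the wrong charset:

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;

class ParamDecodeDemo {
    // What goes over the wire for a form field when the page was served
    // in pageCharset (percent-encoded bytes of that charset).
    static String submit(String value, String pageCharset) {
        return URLEncoder.encode(value, Charset.forName(pageCharset));
    }

    // What the page gets back if the parameter is decoded with requestCharset.
    static String receive(String wire, String requestCharset) {
        return URLDecoder.decode(wire, Charset.forName(requestCharset));
    }

    public static void main(String[] args) {
        String wire = submit("\u041f", "windows-1251");
        // Decoding with the page's charset recovers the Cyrillic letter.
        System.out.println(receive(wire, "windows-1251").equals("\u041f"));
        // Decoding as iso-8859-1 yields U+00CF instead.
        System.out.println(receive(wire, "ISO-8859-1").equals("\u00cf"));
    }
}
```

Calling request.setCharacterEncoding() with the response encoding corresponds to the first case; leaving it out corresponds to the second.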
-- 
Best regards,
 Anthony Taguov                          mailto:tagunov@newmail.ru

