tomcat-dev mailing list archives

From Pilho Kim <ph...@math.soongsil.ac.kr>
Subject Re: My patches for Tomcat 3.2 wrt multibyte characters
Date Thu, 07 Dec 2000 02:30:42 GMT


On Tue, 5 Dec 2000, Kazuhiro Kazama wrote:

> From: Pilho Kim <phkim@math.soongsil.ac.kr>
> Subject: My patches for Tomcat 3.2 wrt multibyte characters
> Date: Tue, 5 Dec 2000 09:47:38 +0900 (KST)
> Message-ID: <Pine.GSO.4.05.10012050946040.10702-100000@math.soongsil.ac.kr>
> > Try to visit
> > 
> >     http://www.javaclue.org/tomcat/patch32/dopatch.html
> > 
> > I hope that those would be adopted in TC 3.2.1.
> 
> We are developing similar patches for Japanese users. Your patches
> have a few problems:
> 
> 1, Don't change a DEFAULT_CHAR_ENCODING constant in
> src/share/org/apache/tomcat/core/Constants.java
> 
> The basic rule of Web i18n is to specify the "right" charset. If you
> specify the charset explicitly in Tomcat 3.2, your Web applications
> work well except for a few well-known problems (charset handling in
> JSP includes, getParameter(), etc.; these are resolved in Servlet
> API 2.3 and JSP 1.2).
> 
> And "iso-8859-1" is defined as default charset in Servlet API 2.3
> final draft (See 4.9, 5.4, javax.servlet.ServletResponse). It isn't
> desirable to introduce another i18n concept to Servlet API 2.2.
> 
> Of course, some internal modules such as DefaultCMSetter must send a
> reply in the platform's native charset because of localized messages
> from native libraries, etc. In this case, it is better to specify the
> charset internally.
>

This is where you are badly mistaken.
You should understand what "default" really means in an API.
What is the default encoding in the JVM?
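
To make the question concrete, here is a minimal sketch of how the JVM
default can be inspected and how it differs from an explicitly named
charset. It is illustration only, not part of either patch: the class
name DefaultEncodingDemo is hypothetical, and it uses java.nio.charset
APIs that are newer than the JDKs Tomcat 3.2 targets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultEncodingDemo {
    // Returns the charset the JVM falls back to when none is specified.
    static Charset jvmDefault() {
        return Charset.defaultCharset();
    }

    // Encodes text with an explicitly named charset instead of the default.
    static byte[] encode(String text, Charset cs) {
        return text.getBytes(cs);
    }

    public static void main(String[] args) {
        // The value depends on the platform and on JVM startup options.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + jvmDefault());

        String han = "\uD55C"; // one Hangul syllable
        // ISO-8859-1 cannot represent it, so getBytes substitutes '?'.
        System.out.println("ISO-8859-1 bytes: " + encode(han, StandardCharsets.ISO_8859_1).length);
        // UTF-8 needs three bytes for this character.
        System.out.println("UTF-8 bytes:      " + encode(han, StandardCharsets.UTF_8).length);
    }
}
```

The point of the sketch is only that the JVM default is a property of
the running platform, while iso-8859-1 in the Servlet API is a fixed
protocol-level default; the two need not agree.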

 
 
> 2, Don't change Jasper's default encoding to a platform native
> character encoding.
> 
> This spoils the platform independence of JSP files. For example, it
> is very common to serve JSP files encoded in Shift_JIS (as used on
> Windows PCs) from a Unix-variant server whose default charset is EUC-JP.
> 
> And JSP files specified charset work well except for "include"
> directive. 
> 
> JSP 1.2 provides the "pageEncoding" attribute for this problem. But
> there is no workaround in JSP 1.1. This is a serious problem. I think
> a charset-inheritance mechanism is better if possible.
> 
> We plan to provide a pageEncoding patch (an implementation of the
> JSP 1.2 feature for JSP 1.1) for this problem, and this patch will be
> used at the user's own risk. But it isn't good to provide it officially.
>

Do you know which encoding is used in class files?
It is not iso-8859-1, but UTF-8.
You should read my patch for ClassName.
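
As a rough illustration of the class-file point: strings in a class
file's constant pool are stored in the JVM's "modified UTF-8" form,
which is the same form java.io.DataOutputStream.writeUTF produces (a
two-byte length followed by the encoded bytes). The sketch below is an
assumption-laden demo, not the ClassName patch itself; the class name
ModifiedUtf8Demo is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ModifiedUtf8Demo {
    // Serializes a string as a two-byte length plus modified UTF-8 bytes.
    static byte[] writeUtf(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        } catch (IOException e) {
            // An in-memory stream should not fail; rethrow unchecked if it does.
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        // One Hangul syllable: the length prefix plus the encoded bytes.
        StringBuilder hex = new StringBuilder();
        for (byte b : writeUtf("\uD55C")) {
            hex.append(String.format("%02X ", b & 0xFF));
        }
        System.out.println(hex.toString().trim());
    }
}
```

A non-ASCII identifier therefore survives in the class file regardless
of the platform charset, which is why the source encoding used when
compiling the generated servlet matters more than iso-8859-1 here.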

 
> 3, Don't use non-IANA charset.
> 
> Java's default encoding name is usually a converter name and isn't
> included in the IANA registry.
> 
> But there is no best way to convert Java's encoding to IANA charset in
> current JDK & Tomcat.
> 
> I think that it is reasonable to use
> org.apache.tomcat.util.LocaleToCharsetMap (see ResponseImpl.java).
> 
> I am sending a new patch for org.apache.tomcat.context.DefaultCMSetter.
> This patch sets the charset to iso-8859-1 when LocaleToCharsetMap
> returns null (the specified locale isn't registered), but there may
> be better code.
> 


My main patch solves the problem of mixed URL-encoded strings.
You never mentioned that.
Read my patches more carefully.
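
For readers who have not seen the problem, here is a minimal sketch of
why the charset used for URL decoding matters. It assumes "mixed" means
literal ASCII characters alongside percent-encoded multibyte bytes; it
is not the code from the patch, and the class name UrlDecodeDemo is
hypothetical. It relies on java.net.URLDecoder.decode(String, String),
which is newer than the JDKs Tomcat 3.2 targets.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class UrlDecodeDemo {
    // Decodes a percent-encoded string using the named charset.
    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            // Only reachable if the charset name is unknown to the JVM.
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // Literal ASCII 'a' and 'b' around one percent-encoded Hangul syllable.
        String mixed = "a%ED%95%9Cb";

        // Decoded as UTF-8, the three escaped bytes become one character.
        String utf8 = decode(mixed, "UTF-8");
        // Decoded as ISO-8859-1, each escaped byte becomes its own character.
        String latin1 = decode(mixed, "ISO-8859-1");

        System.out.println("UTF-8 length:      " + utf8.length());
        System.out.println("ISO-8859-1 length: " + latin1.length());
    }
}
```

The same input yields different strings depending on the charset, so a
container that always decodes with one fixed charset will mangle
parameters sent in another.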
 

Kim


