tomcat-dev mailing list archives

From "Forrest R. Girouard" <Forrest.Girou...@openwave.com>
Subject Re: [PATCH] '8859_1' is not a valid charset alias
Date Fri, 18 May 2001 19:40:04 GMT

It is my understanding that '8859_1' is an alias for a Java encoding
name, and that this encoding maps to the 'ISO-8859-1' character set.
The Java encoding name and the character set name are not always the
same.
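To illustrate the difference between the two kinds of name, here is a
minimal sketch. It assumes a JDK that provides java.nio.charset.Charset
(1.4 or later), and the class and method names (EncodingNames,
canonicalCharsetName, historicalEncodingName) are made up for this
example. It resolves an alias to the canonical charset name, and
separately asks an InputStreamReader which encoding name it reports.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

public class EncodingNames {

    /** Canonical charset name for an alias, as reported by Charset.name(). */
    static String canonicalCharsetName(String alias) {
        return Charset.forName(alias).name();
    }

    /**
     * Encoding name reported by a reader opened with the given alias.
     * InputStreamReader.getEncoding() is documented to return the
     * historical (Java) name when the encoding has one.
     */
    static String historicalEncodingName(String alias) {
        try {
            InputStreamReader reader = new InputStreamReader(
                    new ByteArrayInputStream(new byte[0]), alias);
            return reader.getEncoding();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown encoding: " + alias, e);
        }
    }

    public static void main(String[] args) {
        String[] aliases = {"8859_1", "ISO8859_1", "ISO-8859-1"};
        for (String alias : aliases) {
            System.out.println(alias
                    + " -> charset " + canonicalCharsetName(alias)
                    + ", Java encoding " + historicalEncodingName(alias));
        }
    }
}
```

On such a JDK all three aliases should resolve to the same charset,
while the reader reports the Java-style name rather than the charset
name.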

Furthermore, although it is not readily apparent, using 'ISO8859_1' as
the Java encoding name is far preferable to using '8859_1' (or any
other alias) under Java 2.

Look at the private getBTCConverter() method in the String.java source
and note the use of the following:

	!encoding.equals(btc.getCharacterEncoding())

The ByteToCharConverter instance for ISO-8859-1 always returns
'ISO8859_1' from its getCharacterEncoding() method.  So although other
names may work, any name other than 'ISO8859_1' fails this comparison
and the ThreadLocal caching is subverted.  Since the
ByteToCharConverter.getConverter() method involves synchronization, it
is not a good thing to subvert the ThreadLocal cache.
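The effect can be modelled with a small sketch. This is not the JDK
source: Converter, lookup(), getConverter() and countSlowLookups() are
hypothetical stand-ins, and only the comparison against
getCharacterEncoding() is taken from String.java as quoted above. The
sketch keeps one converter per thread and counts how often the
synchronized lookup has to run.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ConverterCacheSketch {

    /** Stand-in for a converter; it always reports its canonical name. */
    static final class Converter {
        private final String canonicalName;

        Converter(String canonicalName) {
            this.canonicalName = canonicalName;
        }

        String getCharacterEncoding() {
            return canonicalName;
        }
    }

    /** Counts calls to the synchronized (slow) lookup path. */
    static final AtomicInteger slowLookups = new AtomicInteger();

    /** One cached converter per thread, as in the pattern described above. */
    private static final ThreadLocal<Converter> cache = new ThreadLocal<>();

    /** Hypothetical stand-in for the synchronized converter lookup. */
    static synchronized Converter lookup(String encoding) {
        slowLookups.incrementAndGet();
        if (encoding.equals("ISO8859_1") || encoding.equals("8859_1")) {
            // Both aliases yield a converter that reports 'ISO8859_1'.
            return new Converter("ISO8859_1");
        }
        throw new IllegalArgumentException("not modelled: " + encoding);
    }

    /** Reuses the cached converter only if the requested name matches exactly. */
    static Converter getConverter(String encoding) {
        Converter c = cache.get();
        if (c == null || !encoding.equals(c.getCharacterEncoding())) {
            c = lookup(encoding);   // cache miss: take the synchronized path
            cache.set(c);
        }
        return c;
    }

    /** Clears this thread's cache, then counts slow lookups over n requests. */
    static int countSlowLookups(String encoding, int n) {
        cache.remove();
        int before = slowLookups.get();
        for (int i = 0; i < n; i++) {
            getConverter(encoding);
        }
        return slowLookups.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("ISO8859_1: " + countSlowLookups("ISO8859_1", 1000));
        System.out.println("8859_1:    " + countSlowLookups("8859_1", 1000));
    }
}
```

In this model the canonical name takes the slow path once and then hits
the cache, whereas '8859_1' never matches the cached converter's
reported name and takes the synchronized path on every request.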

Cheers,
	Forrest

Vincent Schonau wrote:
> 
> [this has also been entered as bug #1808]
> 
> Both Tomcat and Apache have the string '8859_1' hard-coded, and declared
> as a public static final String, in several places.
> 
> Although Java accepts '8859_1' as an alias for the ISO-8859-1 character set,
> this isn't a valid name anywhere else; the valid aliases are listed at
> 
> <URL:http://www.iana.org/assignments/character-sets>
> 
> Some user-agents (I first noticed this on an older version of Lynx) are
> confused by this.
> 
> This patch will:
> 
>   - remove all references in code (not comments) to '8859_1'
>   - In classes where this string was used, add a
>     public static final String DEFAULT_CHAR_ENCODING
>     if none was present (this is the most frequently used name
>     when such a field is present)
>   - In the src/org/apache/jasper tree:
>     - add a
>       public static final String DEFAULT_CHAR_ENCODING
>       to Constants.java
>     - replace all occurrences of '8859_1' in code
>       with Constants.DEFAULT_CHAR_ENCODING
>       as this seems to me to be the proper way to do this in Jasper.
> 
> Regards,
> 
> Vince.
> 
>   ------------------------------------------------------------------------------------------------------------------------
> 
>    Name: iso2.patch
>    Type: Plain Text (text/plain)

-- 
Forrest Girouard @ Openwave Systems Inc.
phone: +1-650-817-1556
mailto:Forrest.Girouard@openwave.com
http://www.openwave.com


