tomcat-dev mailing list archives

From Vincent Schonau <vince-jaka...@netnautics.com>
Subject Re: [PATCH] '8859_1' is not a valid charset alias
Date Sat, 19 May 2001 11:01:00 GMT
On Fri, May 18, 2001 at 12:40:04PM -0700, Forrest R. Girouard wrote:
> 
> It is my understanding that '8859_1' is an alias for a Java encoding 
> which maps to the 'ISO-8859-1' character set.  The Java encoding and
> the character set name are not always the same.
> 
> Furthermore, while it's not readily apparent, using 'ISO8859_1' for
> the Java encoding is far preferable to using '8859_1' (or anything 
> else) under Java 2.  
> 
> Look at the private getBTCConverter() method in the String.java source
> and note the use of the following:
> 
> 	!encoding.equals(btc.getCharacterEncoding())
> 
> The ByteToCharConverter instance for ISO-8859-1 always returns 'ISO8859_1'
> from its getCharacterEncoding() method, which means that, while other
> names may work, the ThreadLocal caching will be subverted.  Since the
> ByteToCharConverter.getConverter() method involves synchronization, it
> is not a good thing to subvert the ThreadLocal cache.

Thanks for pointing this out. AFAICS, the use of 'iso-8859-1' instead of
'8859_1' (my patch) does not make this situation any better or worse in the
tomcat code. <g>

The Tomcat 3.x code doesn't look like it takes this into account at all. I
wonder if looking up the Java encoding name associated with the charset
name supplied by user-agents etc. is an optimisation worth making. I'll look
into that.
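
A rough sketch of what such a lookup could look like, for illustration only.
It assumes the java.nio.charset API (which arrived in later JDKs than the
sun.io converters discussed above), so the canonical name it yields is the
charset name (e.g. 'ISO-8859-1'), not the legacy Java encoding name
'ISO8859_1'. The class and method names (EncodingNames, canonical) are
hypothetical and not part of Tomcat.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: maps a charset name as sent by a user-agent
// to the single canonical name the JVM uses for that charset.
final class EncodingNames {

    // Only successful lookups are cached, so the map is bounded by
    // the number of aliases the JVM actually knows about.
    private static final Map<String, String> CACHE = new ConcurrentHashMap<>();

    private EncodingNames() {
    }

    // Returns the canonical charset name, or null if the name is
    // missing, malformed, or not supported by this JVM.
    static String canonical(String requested) {
        if (requested == null) {
            return null;
        }
        // Charset lookup is case-insensitive; lowercasing just keeps
        // the cache from holding one entry per spelling.
        String key = requested.trim().toLowerCase(Locale.ROOT);
        String cached = CACHE.get(key);
        if (cached != null) {
            return cached;
        }
        try {
            String name = Charset.forName(key).name();
            CACHE.put(key, name);
            return name;
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            return null;
        }
    }
}
```

With something like this, EncodingNames.canonical("8859_1") and
EncodingNames.canonical("iso-8859-1") would resolve to the same name, so
whatever the client sends, the rest of the code could always pass one
consistent name to the JDK.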



Vince.

