tomcat-users mailing list archives

From Christopher Schultz <ch...@christopherschultz.net>
Subject Re: Migrating to tomcat 6 gives formatted currency amounts problem
Date Fri, 12 Sep 2008 16:54:47 GMT

André,

André Warnier wrote:
> It is on the way through that servlet that they get "corrupted", unless
> I start Tomcat with LC_CTYPE="iso-8859-1".

What do the HTTP headers say when the file is served correctly versus
when it is not? I suspect that the encoding is either set incorrectly or
not set at all unless you specify LC_CTYPE.
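
When comparing the two responses, the value to look at is the charset parameter of the Content-Type header. The sketch below only shows how that parameter can be pulled out of a header value for comparison. The class and method names are made up for illustration, and the parsing is deliberately simplified (it does not handle every legal header form).

```java
import java.util.Locale;

public class ContentTypeCharset {

    // Returns the charset parameter of a Content-Type value,
    // or null when the header carries no charset at all.
    public static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters follow it.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=ISO-8859-1"));
        System.out.println(charsetOf("text/html"));
    }
}
```

A response whose header yields null here leaves the client to guess the encoding, which is the "not set at all" case.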

> So my question remains, I think : what could be going on in that servlet
> so that :
> - if LC_CTYPE is not set in the environment *of Tomcat* when it starts,
> the upper iso-8859-1 characters in the pages are replaced by "?"
> - if LC_CTYPE is set to "iso-8859-1" in the Tomcat environment when it
> starts, then the pages delivered by the servlet are correct
> ?

My guess is that the magic servlet here is using the platform's default
encoding in the HTTP headers, which may be incorrect for the static file
in question.

> I am not very qualified in Java, but could it be something like :
> - the servlet reads those documents with some InputStream, without
> specifying a character set or encoding

Note that InputStreams are encoding-less. It sounds like semantics, but
encodings only come into play when you are dealing with
character-oriented streams which, in Java, are called Readers and
Writers. Note that neither InputStream nor OutputStream has any methods
that deal with the char data type.
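
A minimal sketch of that distinction, assuming a current JDK (StandardCharsets and UncheckedIOException are newer than the Java versions in use when this thread was written). The class name and the sample bytes are made up for illustration: the word "café" encoded as ISO-8859-1.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class StreamVersusReader {

    // "café" in ISO-8859-1: the last byte, 0xE9, is an "upper" character.
    public static final byte[] LATIN1_BYTES = {0x63, 0x61, 0x66, (byte) 0xE9};

    // A charset only enters the picture once the byte stream is
    // wrapped in a Reader.
    public static String decode(byte[] bytes, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder sb = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // InputStream.read() hands back plain byte values (0..255) or -1.
        InputStream in = new ByteArrayInputStream(LATIN1_BYTES);
        int b;
        while ((b = in.read()) != -1) {
            System.out.printf("0x%02X ", b);
        }
        System.out.println();

        // The same bytes become text only when a charset is named.
        System.out.println(decode(LATIN1_BYTES, StandardCharsets.ISO_8859_1));
    }
}
```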

> and by default that means to use
> Tomcat's idea of its default LC_CTYPE for those InputStreams ?
> - or the servlet outputs the document via an OutputStream without
> specifying an encoding etc..

I'll bet a binary stream of data is being sent (that is, with no
interpretation or encoding) and that the JVM's default encoding is being
advertised by the server in the HTTP headers. That would certainly cause
the problem.
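
The scenario guessed at above can be simulated without a server. In this sketch (all names hypothetical, and not a claim about what the servlet in question really does) the file bytes are copied through untouched, and the "client" then decodes them using whatever charset the headers advertised.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class AdvertisedCharsetMismatch {

    // Binary copy: bytes go in and come out unchanged, no encoding involved.
    public static byte[] passThrough(byte[] fileBytes) {
        try (InputStream in = new ByteArrayInputStream(fileBytes);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // What a client ends up with when it trusts the advertised charset.
    public static String asSeenByClient(byte[] body, Charset advertised) {
        return new String(body, advertised);
    }

    public static void main(String[] args) {
        // "café" stored on disk as ISO-8859-1.
        byte[] file = {0x63, 0x61, 0x66, (byte) 0xE9};
        byte[] body = passThrough(file);

        // Correct advertisement: the text survives.
        System.out.println(asSeenByClient(body, StandardCharsets.ISO_8859_1));
        // Wrong advertisement: byte 0xE9 is not valid US-ASCII, so it
        // decodes to the replacement character U+FFFD.
        System.out.println(asSeenByClient(body, StandardCharsets.US_ASCII));
    }
}
```

How the replacement character is finally displayed (a "?" or some other placeholder glyph) depends on the client.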

I've found that the default encoding on my Linux box is something I've
never heard of before: "file.encoding=ANSI_X3.4-1968". Since I have my
server configured properly (and don't really serve much in the way of
static content), the platform's default encoding doesn't matter: my
preferred encoding (UTF-8) is always used.
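
For reference, "ANSI_X3.4-1968" is a registered alias for plain US-ASCII, which cannot represent any of the upper iso-8859-1 characters. The sketch below, with made-up class and method names, shows one way to inspect the JVM's default and what happens when text is encoded to US-ASCII. Note that the value reported for the default depends on the JVM version and environment, so no particular output is assumed for it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultEncodingCheck {

    // Resolves a charset name or alias to its canonical name.
    public static String canonicalName(String charsetName) {
        return Charset.forName(charsetName).name();
    }

    // Encodes text with an explicit charset instead of the platform default.
    public static byte[] encode(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        // What this JVM considers its default (varies by version and environment).
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + Charset.defaultCharset());

        // The odd-looking name is just US-ASCII.
        System.out.println(canonicalName("ANSI_X3.4-1968"));

        // Characters that US-ASCII cannot represent are encoded as '?'.
        byte[] ascii = encode("caf\u00e9", StandardCharsets.US_ASCII);
        System.out.println((char) ascii[3]);
    }
}
```

Naming the charset explicitly wherever bytes are turned into characters (or back) is what makes the platform default irrelevant.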

-chris
