tomcat-users mailing list archives

From Christopher Schultz
Subject Re: charset encoding bug
Date Tue, 24 Apr 2007 22:43:39 GMT


Sean Bridges wrote:
> I did a little more digging, and it seems this bug
> only appears when the locale is set.  My full servlet
> code is,
>
>     arg1.setLocale(arg0.getLocale());
>     arg1.setContentType("application/foobar");
>     arg1.getOutputStream().
>         write(arg1.getContentType().getBytes());
>
> In this case the response will be,
> application/foobar;charset=ISO-8859-1

The character set has to be chosen at some point. It looks like you
want to report an incorrect character set (or none, which is just as
bad) to the client.

You're right, an XML-aware client at the other end can determine the
encoding by looking at the BOM or by reading the "encoding" attribute
of the XML declaration. On the other hand, your Writer must have an
encoding set before you can write to it, so you don't have a choice of
encodings in the first place.
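
For what it's worth, the BOM check a client might do can be sketched
roughly like this. This is only an illustration, not what any particular
XML parser does: the class and method names (BomSniffer, fromBom) are
made up, and it only knows the UTF-8 and UTF-16 marks (a UTF-32 BOM
would be misread here).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: guesses a charset from a leading byte-order mark.
class BomSniffer {
    /** Returns the charset implied by a leading BOM, or null if none is present. */
    static Charset fromBom(byte[] b) {
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF
                && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;      // EF BB BF
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;   // FE FF
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;   // FF FE
        }
        // No BOM: the client has to fall back to the XML declaration
        // or to the charset in the Content-Type header.
        return null;
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', '?', 'x', 'm', 'l'};
        System.out.println(fromBom(withBom));
        System.out.println(fromBom("<?xml".getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```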

If the encoding of your response is ISO-8859-1, then your XML emitter
had better be either using the existing servlet-managed Writer (which
already has an encoding, so there is no reason to specify one yourself)
or writing bytes to the same output channel using the /same/ encoding
as the Writer would have.

Either way, the two encodings have to match, or you will have problems.
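
To make the mismatch concrete, here is a small standalone sketch (plain
Java, no servlet API; the class and method names are invented for the
example). It writes text in one encoding and reads it back as if it were
in another, which is what a client does when the declared character set
is wrong.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: encode with one charset, decode with another.
class EncodingMismatch {
    /** Encodes text as 'written', then decodes the bytes as 'declared'. */
    static String roundTrip(String text, Charset written, Charset declared) {
        byte[] wire = text.getBytes(written);
        return new String(wire, declared);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e

        // Bytes are UTF-8, but the client was told ISO-8859-1.
        String misread = roundTrip(original,
                StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);
        // Bytes and declaration agree.
        String correct = roundTrip(original,
                StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        System.out.println(misread.equals(original)); // false
        System.out.println(correct.equals(original)); // true
    }
}
```

The accented character takes two bytes in UTF-8, and an ISO-8859-1
reader turns each byte into its own character, so the client sees
garbage where the one character should be.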

Tomcat is appending the character set that will be used whether you like
it or not. If you don't like the character set, then change it. But you
can't simply strip the character set off a Writer and then write "UTF-8"
in the XML declaration and expect everything to work.

Why not pick a character set and stick with it everywhere? UTF-8 is
usually a good choice.
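
As a rough sketch of "pick one and stick with it", the idea is to have a
single charset constant drive the Writer, the XML declaration, and the
Content-Type value. This is plain Java so it runs on its own; the names
(ConsistentXml, render, contentType) are made up, and the body is not
XML-escaped. In a real servlet you would presumably set the response
encoding (for example with response.setCharacterEncoding) before asking
for the Writer instead of building your own.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical emitter: one constant decides every place the charset appears.
class ConsistentXml {
    static final Charset CHARSET = StandardCharsets.UTF_8;

    /** The header value a response would carry (assumed media type). */
    static String contentType() {
        return "application/xml;charset=" + CHARSET.name();
    }

    /** Writes a tiny XML document whose declaration matches the Writer's encoding. */
    static byte[] render(String body) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, CHARSET)) {
            w.write("<?xml version=\"1.0\" encoding=\"" + CHARSET.name() + "\"?>\n");
            w.write("<msg>" + body + "</msg>\n"); // no escaping: sketch only
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(contentType());
        System.out.print(new String(render("caf\u00e9"), CHARSET));
    }
}
```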

-chris
