tomcat-users mailing list archives

From Markus Schönhaber <tomcat-us...@list-post.mks-mail.de>
Subject Re: DefaultServlet doesn't set charset
Date Mon, 11 Aug 2008 16:44:54 GMT
Christopher Schultz wrote:

> Have you rigged the servlet to add a static charset defined in, say,
> web.xml or something like that?

In a way, yes. DefaultServlet already uses the value of the fileEncoding
init-param, if set, as encoding when reading static content from disk.
So, if fileEncoding is explicitly set in web.xml, I also use its value
for the charset info in the response header.
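
A minimal sketch of that idea, for illustration only. It is not the code
from the patch; the class and method names are hypothetical, and skipping
content types that already carry a charset parameter is an assumption of
this sketch. In a servlet, the result would be passed to
response.setContentType().

```java
import java.util.Locale;

final class ContentTypeCharset {

    /**
     * Returns the content type with ";charset=<fileEncoding>" appended.
     * The content type is returned unchanged if it is null, if no
     * fileEncoding is configured, or if it already names a charset.
     */
    static String withCharset(String contentType, String fileEncoding) {
        if (contentType == null || fileEncoding == null
                || fileEncoding.trim().isEmpty()) {
            return contentType;
        }
        // Do not override a charset that is already part of the type.
        if (contentType.toLowerCase(Locale.ROOT).contains("charset=")) {
            return contentType;
        }
        return contentType + ";charset=" + fileEncoding.trim();
    }

    public static void main(String[] args) {
        System.out.println(withCharset("text/html", "ISO-8859-1"));
    }
}
```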

> Is there any logic to guess the actual
> charset?

Depends on what charset you mean.
- The charset of a file on disk? Then no, I haven't touched the code for
reading files from disk - and I don't intend to.
- The charset added to the Content-Type response header? Then yes. If
fileEncoding (see above) is not set, the value from
java.nio.charset.Charset.defaultCharset().name()
is used.
BTW: this is OK for Tomcat 6. But anyone interested in porting this
to an older version of Tomcat that is supposed to run on pre-1.5 JVMs
should keep in mind that this has to be changed, because
Charset.defaultCharset() was only added in Java 1.5. For example, into
something like
(new OutputStreamWriter(new ByteArrayOutputStream())).getEncoding()
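
The two approaches above can be compared side by side. This is only a
sketch with hypothetical class and method names. One caveat, as far as I
read the Javadoc: OutputStreamWriter.getEncoding() may return a
"historical" name (e.g. "UTF8") rather than the canonical one ("UTF-8"),
so the sketch maps it back through Charset.forName() before it would be
used in a header. That last step itself needs java.nio (Java 1.4).

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;

final class DefaultCharsetName {

    /** Java 1.5+: canonical name of the platform default charset. */
    static String viaCharset() {
        return Charset.defaultCharset().name();
    }

    /** Older JVMs: ask a throwaway writer which encoding it picked. */
    static String viaWriter() {
        return new OutputStreamWriter(new ByteArrayOutputStream()).getEncoding();
    }

    /** Maps an alias or historical name to the canonical charset name. */
    static String canonical(String name) {
        return Charset.forName(name).name();
    }

    public static void main(String[] args) {
        System.out.println("Charset.defaultCharset(): " + viaCharset());
        System.out.println("OutputStreamWriter:       " + viaWriter());
        System.out.println("canonicalized:            " + canonical(viaWriter()));
    }
}
```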

> Are you actually setting the character set of the response's
> Writer?

No. But a good point!
As I understand it, DefaultServlet always tries to use the
ServletOutputStream. Only if response.getOutputStream() fails with an
IllegalStateException (ISE) and the media type is text/* or *xml does it
try to use the Response object's PrintWriter. So, if the latter is the
case and something other than the platform default encoding should be
used, it might be sensible to set the encoding for the writer.
I have to think about this some more - especially about a real world
example that triggers this.

> I'd love to take a look at your patch.

No problem. You can get it here:
http://www.ddt-consult.de/sendCharset.patch

> I would definitely add some tests to verify correct behavior when the
> charset is set to something that is not sane (like ";;;;").

Hm, yes, one could add a sanity check. But I'd expect people who set
fileEncoding explicitly to not only know what they're doing but to also
check if what they did actually works.
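
If one did want such a sanity check, a minimal sketch could look like
the following. The class and method names are hypothetical, and this is
not part of the patch. It relies only on java.nio.charset.Charset, whose
isSupported() throws IllegalCharsetNameException for syntactically
illegal names such as ";;;;".

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

final class CharsetSanity {

    /**
     * Returns true only if the name is a legal charset name AND this
     * JVM actually supports it.
     */
    static boolean isUsable(String name) {
        if (name == null) {
            return false; // Charset.isSupported(null) would throw
        }
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return false; // e.g. ";;;;"
        }
    }

    public static void main(String[] args) {
        System.out.println("UTF-8 usable: " + isUsable("UTF-8"));
        System.out.println(";;;; usable:  " + isUsable(";;;;"));
    }
}
```

A servlet could run this once in init() and fall back to the platform
default (or refuse to start) when the configured fileEncoding fails it.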

Thanks for your input, Chris.

Regards
  mks

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org