tomcat-dev mailing list archives

From Jan Luehe <Jan.Lu...@Sun.COM>
Subject Re: cvs commit: jakarta-tomcat-connectors/coyote/src/java/org/apache/coyote
Date Wed, 28 Jul 2004 16:48:55 GMT

>>luehe       2004/07/27 17:43:17
>>  Modified:    coyote/src/java/org/apache/coyote
>>  Log:
>>  Fixed Bugtraq 6152759 ("Default charset not included in Content-Type
>>  response header if no char encoding was specified").
>>  According to the Servlet 2.4 spec, calling:
>>    ServletResponse.setContentType("text/html");
>>  must yield these results:
>>    ServletResponse.getContentType() -> "text/html"
>>    Content-Type response header -> "text/html;charset=ISO-8859-1"
>>  Notice the absence of a charset in the result of getContentType(), but
>>  its presence (set to the default ISO-8859-1) in the Content-Type
>>  response header.
>>  Tomcat is currently not including the default charset in the
>>  Content-Type response header if no char encoding was specified.
> -1.  This gets us right back to the same old problem where we are sending
> back "image/gif; charset=iso-8859-1", and nobody can read the response.

Yes, sorry, I had forgotten about that case.

> If we're not going to assume that the UA believes that the default encoding
> is iso-8859-1 (which is what we are doing now),

I think the spec added the requirement to clearly identify the
encoding in all cases (when using a writer) because many browsers
let the user choose which encoding to apply to responses that don't
declare their encoding. That results in data corruption if the
response was encoded in ISO-8859-1 and the user picks an incompatible
encoding.
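
To illustrate the kind of corruption meant here, a minimal sketch: the class and method names below are made up for the example, and UTF-8 is only an assumed stand-in for "an incompatible encoding" a user might pick.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not Tomcat code.
class EncodingMismatchSketch {

    // Encode text the way the server would, then decode it the way
    // the user agent would if it assumed a different charset.
    static String roundTrip(String text, Charset sentAs, Charset readAs) {
        byte[] wire = text.getBytes(sentAs);
        return new String(wire, readAs);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // contains a non-ASCII character

        // Same charset on both sides: the text survives.
        String ok = roundTrip(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1);
        System.out.println(ok.equals(original)); // prints true

        // Sent as ISO-8859-1 but read as UTF-8: the text is corrupted.
        String bad = roundTrip(original,
                StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(bad.equals(original)); // prints false
    }
}
```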

> then I'd suggest simply
> doing:
>    setCharacterEncoding(getCharacterEncoding());
> in Response.getWriter (since the spec only requires that we identify the
> charset when using a Writer, and we don't really know what it is when using
> OutputStream).

The problem with this is that if you call getWriter() (with your
proposed fix) followed by getContentType(), the returned content type
will include a charset, which violates the spec of getContentType():

   * If no character encoding has been specified, the
   * charset parameter is omitted.

This is why we need to append the default charset to the value of the
Content-Type header, if no char encoding has been specified.
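
A rough sketch of the distinction, to make the two return values concrete. This is not the actual Coyote code: the class, its fields, and markWriterUsed() are hypothetical stand-ins, and appending the default only once a writer is in use is my assumption for keeping the "image/gif" case charset-free.

```java
// Hypothetical model of a response, not org.apache.coyote.Response.
class ContentTypeSketch {
    static final String DEFAULT_CHARSET = "ISO-8859-1";

    private String contentType;   // media type as set by the caller
    private String charset;       // null until explicitly specified
    private boolean usingWriter;  // true once a writer was requested

    void setContentType(String type) { this.contentType = type; }

    void setCharacterEncoding(String cs) { this.charset = cs; }

    // Stand-in for getWriter(); only records that a writer is in use.
    void markWriterUsed() { this.usingWriter = true; }

    // Falls back to the default when nothing was specified.
    String getCharacterEncoding() {
        return charset != null ? charset : DEFAULT_CHARSET;
    }

    // Omits the charset parameter if no encoding was specified.
    String getContentType() {
        if (contentType == null) return null;
        return charset == null ? contentType
                               : contentType + ";charset=" + charset;
    }

    // Value for the Content-Type response header: adds the default
    // charset when none was specified and a writer is in use.
    String getContentTypeHeader() {
        if (contentType == null) return null;
        if (charset != null) return contentType + ";charset=" + charset;
        return usingWriter ? contentType + ";charset=" + DEFAULT_CHARSET
                           : contentType;
    }

    public static void main(String[] args) {
        ContentTypeSketch html = new ContentTypeSketch();
        html.setContentType("text/html");
        html.markWriterUsed();
        System.out.println(html.getContentType());       // text/html
        System.out.println(html.getContentTypeHeader()); // text/html;charset=ISO-8859-1

        ContentTypeSketch gif = new ContentTypeSketch();
        gif.setContentType("image/gif"); // written via OutputStream
        System.out.println(gif.getContentTypeHeader());  // image/gif

        // The proposed fix, for comparison: after this call the
        // charset leaks into getContentType() as well.
        ContentTypeSketch fixed = new ContentTypeSketch();
        fixed.setContentType("text/html");
        fixed.setCharacterEncoding(fixed.getCharacterEncoding());
        System.out.println(fixed.getContentType());      // text/html;charset=ISO-8859-1
    }
}
```

The point of keeping the default out of the stored state is that getContentType() can still tell "never specified" apart from "explicitly set to ISO-8859-1".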

