tomcat-dev mailing list archives

From Jan Luehe <Jan.Lu...@Sun.COM>
Subject Re: cvs commit: jakarta-tomcat-connectors/http11/src/java/org/apache/coyote/http11
Date Wed, 28 Jul 2004 17:05:32 GMT

Remy Maucherat wrote:
> Remy Maucherat wrote:
>> Cool. So after all the efforts I'm doing to optimize, you casually add 
>> GC, because the servlet API is completely stupid ?
>> So -1 for your patch: you need to rework it. I also didn't read all 
>> that funny stuff in the specification, so where does it come from ? ;)
> I thought about it some more, and in addition to the performance 
> problem, it would lead to adding charset even for binaries. We already 
> found out that this was very bad (in addition to being meaningless, it 
> breaks some clients such as Acrobat which simply does a check on the 
> MIME type without removing the charset, which is rather logical since 
> they're dealing with binaries) - this was some unintended behavior in 
> some 4.1.x release (if I remember well, it was introduced by you 
> already: you do like charset related issues ;) ).

yes, sorry, I had forgotten about that case.

Believe me, I didn't commit yesterday's patch simply because
I don't have anything better to do. It's come up as a compliance
issue which had been masked by an unrelated problem that was fixed.

> So what I suggest is that you send that as feedback to the 
> servlet API, and that they put a small errata for the specification.
> -1 for implementing the requirements of the specification (although I 
> can't find them anywhere ;) ) to the letter right now, as they are 
> broken (I didn't dislike the previous behavior, so I want it to remain).


     * <p>Containers must communicate the character encoding used for
     * the servlet response's writer to the client if the protocol
     * provides a way for doing so. In the case of HTTP, the character
     * encoding is communicated as part of the <code>Content-Type</code>
     * header for text media types. Note that the character encoding
     * cannot be communicated via HTTP headers if the servlet does not
     * specify a content type; however, it is still used to encode text
     * written via the servlet response's writer.

Reason: Many browsers let the user choose which encoding to apply to
responses that don't declare their encoding, which will result in data
corruption if the response was encoded in ISO-8859-1 and the user
picks an incompatible encoding.


     * If no character encoding has been specified, the
     * charset parameter is omitted.

Reason: Allow setLocale() to set the response's char encoding
(corresponding to the locale) if the char encoding has not already
been set by setContentType() or setCharacterEncoding().

You can find the corresponding wording in the servlet spec.

Therefore, we need to add a charset to the Content-Type response header if:

- no response encoding has been specified (i.e., the return value of
   ServletResponse.getContentType() has no charset), and
- the response is using a writer

My patch did not check for the latter.
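
To make the two conditions concrete, here is a minimal sketch of the check. It is not the committed patch or Tomcat's actual code: the class CharsetHeaderSketch and the methods hasCharset() and headerValue() are hypothetical names, and the charset detection is deliberately simplified (it does not handle quoted parameter values).

```java
import java.util.Locale;

// Illustrative sketch only: hypothetical names, not Tomcat's implementation.
public class CharsetHeaderSketch {

    // True if the content type already carries a charset parameter.
    // Simplified parsing: splits on ';' and ignores quoted values.
    public static boolean hasCharset(String contentType) {
        if (contentType == null) {
            return false;
        }
        for (String part : contentType.split(";")) {
            if (part.trim().toLowerCase(Locale.ENGLISH).startsWith("charset=")) {
                return true;
            }
        }
        return false;
    }

    // Computes the Content-Type header value to send.
    // A charset is appended only when BOTH conditions hold:
    //   1. the content type has no charset yet, and
    //   2. the response body is produced through a writer.
    public static String headerValue(String contentType,
                                     boolean usingWriter,
                                     String writerEncoding) {
        if (contentType == null) {
            // No content type set: the encoding cannot be communicated.
            return null;
        }
        if (usingWriter && !hasCharset(contentType)) {
            return contentType + ";charset=" + writerEncoding;
        }
        return contentType;
    }

    public static void main(String[] args) {
        // Text written through a writer, no charset given: charset is added.
        System.out.println(headerValue("text/html", true, "ISO-8859-1"));
        // Binary sent through an output stream: left untouched.
        System.out.println(headerValue("application/pdf", false, "ISO-8859-1"));
        // Charset already specified: left untouched.
        System.out.println(headerValue("text/html;charset=UTF-8", true, "ISO-8859-1"));
    }
}
```

With the writer check in place, the binary case (e.g. application/pdf sent through an output stream) keeps its Content-Type unchanged, which is the Acrobat case mentioned above.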

