tomcat-users mailing list archives

From Nikola Milutinovic <Nikola.Milutino...@ev.co.yu>
Subject Re: TC 5.0.14 Breaks UTF-8 Content via HTTP Header
Date Tue, 11 Nov 2003 07:06:58 GMT
Tony LaPaso wrote:

> Here's What I Did
> -----------------
> In both versions of TC, I added an "em dash" character to the
> "/tomcat-docs/cgi-howto.html" documents that come with the TC documentation.
> The UTF-8 representation for the "em dash" character is the three bytes
> 0xE28094. I also made sure both documents had the following META tag in its
> <head>:
> 
> <meta http-equiv='Content-Type' content='text/html; charset=utf-8'/>

That makes it a correct HTML document: the actual encoding of the file 
matches the encoding it declares.
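
The byte sequence quoted above is easy to check. A minimal sketch in plain Java (the class and method names are made up for illustration, nothing here comes from Tomcat):

```java
import java.nio.charset.StandardCharsets;

public class EmDashBytes {
    // Returns the UTF-8 encoding of the given text as uppercase hex digits.
    static String utf8Hex(String text) {
        StringBuilder hex = new StringBuilder();
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        // U+2014 is the em dash; it is written as an escape so that the
        // encoding of this source file does not matter.
        System.out.println(utf8Hex("\u2014")); // prints E28094
    }
}
```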

> Here's What I Saw (TC v5.0.14)
> ------------------------------
> Under TC v5.0.14 the "em dash" character was rendered as *THREE SEPARATE
> CHARACTERs* (one for each byte). Moreover, putting a sniffer on the HTTP
> stream indicated the following response header was being sent by the v5.0.14
> Coyote Connector:
> Content-Type: text/html;charset=ISO-8859-1
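
The "three separate characters" symptom is what you get when UTF-8 bytes are decoded under an ISO-8859-1 label, because that charset maps every byte to exactly one character. A rough sketch in plain Java (names are hypothetical, and a real browser may well treat the label as windows-1252, so the glyphs on screen can differ):

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WrongCharsetDecode {
    // Decodes raw body bytes the way a client would if it trusted the given charset label.
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // The UTF-8 bytes of the em dash, as found in the HTML file.
        byte[] emDash = {(byte) 0xE2, (byte) 0x80, (byte) 0x94};

        String asUtf8 = decode(emDash, StandardCharsets.UTF_8);
        String asLatin1 = decode(emDash, StandardCharsets.ISO_8859_1);

        System.out.println(asUtf8.length());   // prints 1
        System.out.println(asLatin1.length()); // prints 3
    }
}
```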

First of all, was that an HTML file or a JSP? If it was a JSP, then unless you 
specify the encoding in the JSP page directive, Tomcat will, and should, use 
the default encoding (ISO-8859-1) in the HTTP header.

Secondly, what is actually sent in the TC 5.0.12 case?

> Conclusion (?)
> --------------
> It seems that v5.0.14 of the Coyote Connector is incorrectly sending the
> wrong response header. I'm not sure what the HTTP spec says *should* be sent
> for the header if the document's <head> contains:
> 
> <meta http-equiv='Content-Type' content='text/html; charset=utf-8'/>

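The <meta> tag is part of the HTML specification, which lets the page author 
declare the encoding inside the document itself. Note, however, that HTML 4.01 
(section 5.2.2) gives a charset in the HTTP Content-Type header priority over 
the <meta> declaration, so a client should consult <meta> only when the header 
carries no charset.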
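
The charset travels in the same "type; charset=name" syntax in both the HTTP header and the content attribute of the <meta> tag. A minimal sketch of pulling it out of either string follows; the class and method names are hypothetical, and the regular expression is a simplification rather than a full MIME parameter parser:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ContentTypeCharset {
    // Matches a charset parameter, with or without spaces and optional double quotes.
    private static final Pattern CHARSET = Pattern.compile(
            ";\\s*charset\\s*=\\s*\"?([^\\s;\"]+)\"?", Pattern.CASE_INSENSITIVE);

    // Returns the charset named in a Content-Type value, or empty if none is given.
    static Optional<String> charsetOf(String contentType) {
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // The header value seen on the wire and the value from the <meta> tag.
        System.out.println(charsetOf("text/html;charset=ISO-8859-1").orElse("(none)"));
        System.out.println(charsetOf("text/html; charset=utf-8").orElse("(none)"));
        System.out.println(charsetOf("text/html").orElse("(none)"));
    }
}
```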

For static content, such as HTML pages, you cannot set an encoding other than 
the default on a per-page basis. For dynamic content, such as JSP, you can do 
so in the JSP page directive, like this:

<%@ page
   info="A test page"
   contentType="text/html; charset=utf-8"
%>
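
Why the charset matters for generated output can be shown without a container. The sketch below is plain Java, not Tomcat code, and the names are made up; it only illustrates what any ISO-8859-1 encoder does with a character it cannot represent:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class OutputEncodingSketch {
    // Encodes a response body with the charset that would be announced in Content-Type.
    static byte[] encodeBody(String body, Charset charset) {
        return body.getBytes(charset);
    }

    public static void main(String[] args) {
        String body = "a\u2014b"; // "a", an em dash, "b"

        byte[] utf8 = encodeBody(body, StandardCharsets.UTF_8);
        byte[] latin1 = encodeBody(body, StandardCharsets.ISO_8859_1);

        System.out.println(utf8.length);      // prints 5
        System.out.println(latin1.length);    // prints 3
        // ISO-8859-1 has no em dash, so the encoder substitutes a question mark.
        System.out.println((char) latin1[1]); // prints ?
    }
}
```

With charset=utf-8 in the page directive, the header and the bytes agree and the em dash survives.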

Nix.



