tomcat-users mailing list archives

From André Warnier ...@ice-sa.com>
Subject Re: specifying the content-type
Date Tue, 07 Jun 2011 20:33:22 GMT
Christopher Schultz wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Bernd,
> 
> On 6/7/2011 2:23 PM, Lentes, Bernd wrote:
>> Christopher Schultz wrote:
>>> How did you do it? If you use <META HTTP-EQUIV="Content-Type"
>>> CONTENT="text/html" />, it should override any Content-Type
>>> sent in the HTTP response headers
>> Yes, we used this. But http://de.selfhtml.org/html/kopfdaten/meta.htm#zeichenkodierung
>> (unfortunately only in German) says
>> "Im Konfliktfall, also wenn der Webserver im HTTP-Header eine hiervon abweichende
>> Angabe sendet, wird üblicherweise die Angabe des HTTP-Headers verwendet.", which means that,
>> if you have the META in the HTML file and also the Content-Type in the HTTP header, the
>> HTTP header usually "wins".
> 
> You're right. I had it wrong: the HTTP header overrides the content of
> the document.

Well, it /should/, according to the HTTP RFC.
However, many IE versions (and IE unfortunately is still the most widely used browser in
corporate environments) don't give a damn about the Content-Type sent by the server if it
conflicts with their own sniffing of the content.
http://lmgtfy.com?q=ie+and+content-type

> 
>>>> Our developers try now to use the
>>>> response.setContentType("text/html"); method to configure the
>>>> content-type in the HTTP-Header.
>>> This is the proper way to do things. Using <META> does not hurt.
>>>
.. at least if both are consistent.

Meanwhile, as long as your developers are fixing this, you may want to suggest to them
that they also add a character set indication to the Content-Type, like:

Content-type: text/html;charset=UTF-8

using for example: response.setContentType("text/html; charset=UTF-8");

(provided, of course, that the pages you send back really are encoded using that character set).
And also add it to your <meta> tag, like so:
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8" />

That will save you other problems down the line if any of these pages can also submit
data back to the server, as in
<input name="STADT" value="München">
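
To illustrate what can happen to such a field, here is a minimal sketch using only
java.net.URLEncoder and java.net.URLDecoder from the JDK. The names FormEncodingDemo,
formEncode and formDecode are made up for this example; they are not part of Tomcat or
of the Servlet API. It only approximates what a browser sends for STADT depending on the
charset it uses, and what the server ends up with if it decodes using the other one.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Tomcat or the Servlet API.
class FormEncodingDemo {

    // Percent-encode a form value using the given charset.
    static String formEncode(String value, Charset charset) {
        try {
            return URLEncoder.encode(value, charset.name());
        } catch (UnsupportedEncodingException e) {
            // Cannot happen for a Charset object that already exists.
            throw new IllegalStateException(e);
        }
    }

    // Decode a percent-encoded form value using the given charset.
    static String formDecode(String encoded, Charset charset) {
        try {
            return URLDecoder.decode(encoded, charset.name());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "München", written with an escape so the source file encoding does not matter.
        String city = "M\u00fcnchen";

        String asUtf8 = formEncode(city, StandardCharsets.UTF_8);        // M%C3%BCnchen
        String asLatin1 = formEncode(city, StandardCharsets.ISO_8859_1); // M%FCnchen
        System.out.println("STADT=" + asUtf8);
        System.out.println("STADT=" + asLatin1);

        // Browser sent UTF-8, server decoded as ISO-8859-1: the umlaut becomes two characters.
        String corrupted = formDecode(asUtf8, StandardCharsets.ISO_8859_1);
        System.out.println(corrupted.length() + " characters instead of " + city.length());
    }
}
```

The same city name travels as two different byte sequences, so the server can only get it
right if it knows which charset the browser used.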

So to maximise your chances of everything working correctly in a country where not
everyone speaks only English, the following elements should agree:
- the type (text/html) and charset indicated by the server in the Content-Type header
- the type and charset indicated in the <meta> tag in the page itself
- the way the page itself was created on the server (with a UTF-8 aware editor, and saved
  as UTF-8 without BOM)
In addition, if the page contains a <form> tag, make sure it has the following attribute:
<form .... accept-charset="UTF-8">
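
If you want to check that agreement mechanically, something along these lines could be a
starting point. It is only a sketch: ContentTypeCheck, mediaTypeOf, charsetOf and agree
are hypothetical names, the parsing is a simple regular expression rather than a full
header parser, and how you obtain the header value and the <meta> CONTENT value is left
to you.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tomcat or the Servlet API.
class ContentTypeCheck {

    // Matches charset=UTF-8 or charset="UTF-8", case-insensitively.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*\"?([^\";\\s]+)\"?", Pattern.CASE_INSENSITIVE);

    // The media type part, e.g. "text/html", lower-cased.
    static String mediaTypeOf(String contentType) {
        int semicolon = contentType.indexOf(';');
        String type = semicolon < 0 ? contentType : contentType.substring(0, semicolon);
        return type.trim().toLowerCase(Locale.ROOT);
    }

    // The charset parameter, upper-cased, or empty if none is given.
    static Optional<String> charsetOf(String contentType) {
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? Optional.of(m.group(1).toUpperCase(Locale.ROOT)) : Optional.empty();
    }

    // True only if both values name the same media type and the same charset
    // (a missing charset on one side counts as a disagreement).
    static boolean agree(String headerValue, String metaContent) {
        return mediaTypeOf(headerValue).equals(mediaTypeOf(metaContent))
                && charsetOf(headerValue).equals(charsetOf(metaContent));
    }

    public static void main(String[] args) {
        System.out.println(agree("text/html;charset=UTF-8", "text/html; charset=utf-8"));
        System.out.println(agree("text/html", "text/html; charset=UTF-8"));
    }
}
```

Treating a missing charset as a mismatch is a deliberate choice here, since leaving it out
on one side is exactly the kind of imprecision that causes trouble.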

The reason for all the above is that HTTP and HTML, for historical reasons, tend to default
to ISO-8859-1 as a character set, while everything to do with Java (like Tomcat) tends to
default to Unicode/UTF-8.  And by not being very precise and consistent, you always run
the risk of mixing them up, which for languages like German leads to very difficult-to-debug
data corruption problems, the least of which is lozenges with "?" in them in your
pages, instead of umlauts.
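
For the curious, both kinds of mix-up can be reproduced with nothing but the JDK. This is
a small sketch, with CharsetMixup and reinterpret being names invented for the example:
it encodes a string with one charset and decodes the resulting bytes with the other.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class showing what a charset mismatch does to an umlaut.
class CharsetMixup {

    // Encode text with one charset, then decode the same bytes with another.
    static String reinterpret(String text, Charset writtenAs, Charset readAs) {
        byte[] bytes = text.getBytes(writtenAs);
        return new String(bytes, readAs);
    }

    public static void main(String[] args) {
        // "München", written with an escape so the source file encoding does not matter.
        String city = "M\u00fcnchen";

        // Written as UTF-8, read as ISO-8859-1: the umlaut turns into two characters.
        String twoChars = reinterpret(city, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1);

        // Written as ISO-8859-1, read as UTF-8: the lone byte 0xFC is not valid UTF-8,
        // so Java substitutes U+FFFD, which browsers draw as the lozenge with "?".
        String lozenge = reinterpret(city, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println(twoChars.length());           // 8: one character more than "München"
        System.out.println(lozenge.indexOf('\uFFFD'));   // 1: the position where the umlaut was
    }
}
```

Note that the second case loses information: once the umlaut has become U+FFFD there is
no way to get it back, which is part of what makes these problems so painful to debug.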



---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

