cocoon-users mailing list archives

From Jan Uyttenhove <jan.uyttenh...@xume.com>
Subject Re: [once again] best way to set http header encoding
Date Tue, 17 Feb 2004 10:00:35 GMT
All right.

If I find the time I'll provide a patch for this, because it is a very
serious problem at the moment. I was already investigating the problem
and a solution myself when you posted it to the mailing list, because I
can't work with the Apache directive. I will explain why.

The default encoding in the header was added not 'that' long ago[1]. I'm
using Tomcat 4.1.29 now, and it *does* set a default encoding of
ISO-8859-1. So in a way you're lucky you're still using Tomcat 4.1.12,
because with 4.1.29 the Apache directive has no effect when connecting
to Tomcat :-) This is why I'm looking for a good solution in Cocoon...

We see, and will keep seeing, this issue more and more, as Bruno
mentioned, but not only because of the browser implementation. In my
opinion it is caused by the changes in this default encoding behaviour
in Tomcat. The very latest changes should change the behaviour again, so
the upcoming Tomcat 4.1.30 should behave differently once more. I don't
know what exactly the difference will be; this is what the changelog for
4.1.30 says:
" Restore the ability to explicitly set the charset to iso-latin-1. Now, 
you won't get the charset unless you ask for it (so no more 
Content-Type: image/gif; charset=iso-8859-1).  However, if you call 
response.setCharacterEncoding("iso-8859-1"), you now get it in the
response."

So, in conclusion:
Somewhere between Tomcat 4.1.12 and 4.1.27 the default encoding
behaviour changed (my guess is 4.1.17), and there's currently no way in
Cocoon to set the HTTP header encoding so that it matches the encoding
of the serializer (and the meta tag). Anyone upgrading an old Tomcat
will have this problem, unless they use ISO-8859-1. The upcoming Tomcat
4.1.30 should change the behaviour again, but I'm not sure in what way.
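
To make the missing piece concrete, here is a minimal sketch of the
string that would have to reach response.setContentType(...): the
serializer's mime-type with its encoding appended as a charset
parameter. This is only an illustration, not Cocoon or Tomcat code. The
class ContentTypes and the method withCharset are hypothetical names;
the only real API used is java.nio.charset.Charset.

```java
import java.nio.charset.Charset;
import java.util.Locale;

/** Hypothetical helper for illustration; not part of Cocoon or Tomcat. */
final class ContentTypes {

    private ContentTypes() {
    }

    /**
     * Returns the mime-type with "; charset=..." appended.
     * The mime-type is returned unchanged when no encoding is given
     * or when it already carries a charset parameter.
     */
    static String withCharset(String mimeType, String encoding) {
        if (encoding == null || encoding.trim().isEmpty()) {
            return mimeType;
        }
        if (mimeType.toLowerCase(Locale.ROOT).contains("charset=")) {
            return mimeType;
        }
        // Charset.forName throws for names the JVM does not know,
        // so a typo in the encoding fails here instead of ending up
        // in the HTTP header.
        String canonical = Charset.forName(encoding.trim()).name();
        return mimeType + "; charset=" + canonical;
    }
}
```

With such a helper, a call like
response.setContentType(ContentTypes.withCharset("text/html", "utf-8"))
would put the same encoding in the header that the serializer writes in
the meta tag, assuming the serializer's encoding is available at the
point where the content type is set.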


Jan

[1] http://cvs.apache.org/viewcvs.cgi/jakarta-tomcat-connectors/coyote/src/java/org/apache/coyote/Response.java



Stefan Burkard wrote:
> Jan Uyttenhove wrote:
> 
>>>
>>> i'm surprised that cocoon's serializer only writes a meta-tag with
>>> the encoding into the html-page. it doesn't set it in the
>>> http-header.
>>>
>>> therefore apache sets the http-header to its standard encoding and
>>> "destroys" the correct encoding of the response, because most
>>> browsers ignore the meta-tag if the http-header encoding is set.
>>
>>
>>
>> Do you mean Apache or Tomcat?
>> Tomcat sets the HTTP header encoding to the default ISO-8859-1 if
>> none is specified. You can also set AddDefaultCharset in Apache, but
>> I think that won't make any difference, as there's already an
>> encoding specified by Tomcat.
> 
> 
> ---> i mean apache.
> if i connect to tomcat directly there's no encoding set in the header,
> just the mime-type (i use tomcat 4.1.12 and ieHTTPHeaders for inspecting
> the header-data).
> therefore my page with russian characters displays correctly, because
> the browser uses the meta-tag in the html.
>
> BUT if i connect via apache/jk2, apache uses the directive you mention
> to set a default encoding - and the encoding is "destroyed". if tomcat
> set the header-encoding - you're absolutely right - apache would
> do nothing.
> 
> 
>>> so i need to set the http-header-encoding with cocoon and ask you
>>> all, what's the best way to do this. are there any actions,
>>> logicsheets or something else?
>>
>>
>>
>> ATM, Cocoon sets the meta content-type tag with the mime-type and the
>> encoding of the serializer. Furthermore, response.setContentType *is*
>> called, which is one of the ways to set the HTTP encoding header. But
>> it is called with the mime-type only as argument, e.g.
>> response.setContentType("text/html"), whereas we should be able to do
>> response.setContentType("text/html; charset=utf-8").
>> Actually, you can: if you change the mime-type of the serializer in
>> the sitemap from e.g. 'text/html' to 'text/html; charset=utf-8',
>> you'll see that the HTTP header encoding has changed to utf-8. Just an
>> illustration of the problem, not a very nice solution :-)
> 
> 
> at the moment i'm happy with setting the apache-directive to utf-8 for
> my virtual-host, but this is another good temporary solution :-)
> 
>> We should be able to set the full content-type (with charset) in 
>> HttpEnvironment or in AbstractProcessingPipeline. I guess that 
>> involves changing at least the setSerializer(...) in 
>> AbstractProcessingPipeline and passing the encoding.
> 
> 
> i will try to open a bugzilla issue for this like bruno dumon wrote, but 
> first i need to search on the cocoon-page where i can do this :-)
> 
> thanks and greetings
> stefan

-- 
Jan Uyttenhove          jan.uyttenhove@xume.com	
 > Xume < - http://www.xume.com

