cocoon-users mailing list archives

From Joerg Heinicke
Subject Re: Encoding problems, still!
Date Fri, 29 Oct 2004 08:42:49 GMT
On 29.10.2004 08:44, Tuomo L wrote:

>>> We're having some serious encoding problems. This happens only with 
>>> the @href attributes in html, when using characters like å, ä and ö 
>>> (in Finnish alphabet). Form encoding works just fine. I've gone 
>>> through all the threads concerning encoding (other people having 
>>> encoding problems too). No luck so far. Is this still an issue in 
>>> Cocoon? Could someone please tell what's wrong?
>> What's the page encoding? Forms work as expected? Just the links
>> don't work? This normally points to a page encoding other than
>> UTF-8, as link requests are encoded in UTF-8 while form requests are
>> encoded in the page encoding. I don't think it is a Cocoon issue.

First a link about all the encodings: (mostly written
by Bruno).

> According to IE, the page encoding is set to UTF-8. The
> container-encoding and form-encoding in web.xml (Tomcat) are set to UTF-8.

The container-encoding should not be touched at all and remain ISO-8859-1.
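
A minimal sketch of why that is, assuming (as I understand the Cocoon request handling) that Cocoon takes the string the container decoded, turns it back into bytes using container-encoding and decodes those bytes again using form-encoding. The class and method names here are made up for illustration; they are not Cocoon API.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Cocoon or Tomcat.
final class ContainerEncodingDemo {

    // What a container does when it decodes the raw request as ISO-8859-1:
    // every percent-encoded byte becomes exactly one character.
    static String containerDecode(String raw) {
        return URLDecoder.decode(raw, StandardCharsets.ISO_8859_1);
    }

    // The assumed second step: recover the original bytes via
    // container-encoding (ISO-8859-1) and decode them as form-encoding (UTF-8).
    static String recode(String decodedByContainer) {
        byte[] originalBytes = decodedByContainer.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "äö" as sent by a browser for a UTF-8 link request
        String raw = "%C3%A4%C3%B6";
        String fromContainer = containerDecode(raw); // four characters, looks garbled
        String recovered = recode(fromContainer);    // two characters again
        System.out.println(fromContainer.length() + " -> " + recovered.length());
    }
}
```

ISO-8859-1 maps every byte to one character and back, so nothing is lost in the first step. If the container had already decoded the request as UTF-8, the second step would start from the wrong bytes and damage the value.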

> HTMLSerializer is set to use UTF-8 (mime-type="text/html; charset=utf-8")
> and has the parameter <encoding>UTF-8</encoding>.

This should result in <meta http-equiv="Content-Type"
content="text/html;charset=utf-8">. The request encoding header should
have the same value ... which is not that easy when using a recent Tomcat:

> The xsl stylesheets use ISO-8859-1, though.

That's not a problem.

> I've also tried setting everything to ISO-8859-1, but
> the problem with the href-attributes in html remains. Mozilla Firefox
> shows the characters correctly when doing "view source", but if I save the
> document on disk and open with ASCII-editor, the encoding is wrong there
> with both IE and Mozilla. So maybe it's not a browser problem?
> Here's an example:
> <a href="äö" foo="äö">äö</a>
> becomes:
> <a href="%C3%A4%C3%B6" foo="&auml;&ouml;">&auml;&ouml;</a>
> when it should read (I think):
> <a href="&auml;&ouml;" foo="&auml;&ouml;">&auml;&ouml;</a>

follow-up mail:
> The URL-encoding is done wrong when serializing to HTML. According to
> specs "äö" should become "%E4%F6" when encoded, not "%C3%A4%C3%B6".
> This seems to be the problem. So far I've noticed this problem with
> the HREF-attribute only.
> For a test I made a stylesheet that substitutes "ä" with "%E4"
> before serializing to HTML. This works, but it should be done by the
> serializer, right?
> Seems like a Cocoon issue.

If it were an error at all, it would be a Xalan serializer problem, I
think. But bugs were reported on this topic and rejected because of the
specs (I think the reporters had the same problem as you):

As I wrote: you simply get different request encodings when submitting a
form (page encoding) and when just clicking <a href=""/> (UTF-8).
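
To make the two results from the follow-up mail concrete, here is a small sketch in plain Java. The class and method names are only for illustration, and it uses the JDK's URLEncoder (the Charset overload needs Java 10 or later), not the Xalan serializer itself; it just shows that both escaped forms come from the same two characters encoded with different charsets.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not reproduce the serializer's code path.
final class HrefEncodingDemo {

    // Percent-encode using the UTF-8 bytes of each character.
    static String encodeUtf8(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // Percent-encode using the ISO-8859-1 bytes of each character.
    static String encodeLatin1(String s) {
        return URLEncoder.encode(s, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "äö" written with Unicode escapes so the source file encoding does not matter
        String s = "\u00e4\u00f6";
        System.out.println(encodeUtf8(s));   // %C3%A4%C3%B6
        System.out.println(encodeLatin1(s)); // %E4%F6
    }
}
```

So "%C3%A4%C3%B6" in the href is "äö" escaped as UTF-8, while "%E4%F6" is the same text escaped as ISO-8859-1. Whichever side decodes the request has to assume the same charset.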

