cocoon-users mailing list archives

From christian bindeballe <>
Subject Re: XML-Serializer encoding
Date Mon, 16 Jan 2006 18:17:31 GMT
Hello Marc,

Marc Portier wrote:
> never change your container-encoding unless you have a servlet container
> of which you can specify the used encoding applied in decoding of url's
> and request parameters
> (if you don't understand what I just said: that translates to simply
> "never")

I think I got it :) The comments in web.xml say the same thing: never
change it unless the servlet container is buggy (which I suppose Tomcat
5.0.28 is not). I thought I might give it a shot anyway, but since that
didn't help I changed it back to the original setting.

> like where? I just did a rough scan but couldn't find any 'multiple byte
> for single character' occurrences

OK, so I believe I got something wrong. Are the characters that I took
for Unicode characters actually XML character references?
The feeds often contain characters like &#8221;. Since these aren't
translated properly and are not part of Latin-1, I thought they must
be UTF-8, which they obviously aren't, or are they?
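
A small sketch of the distinction in question, using only Python's standard library (the variable names are illustrative, not from the feeds or the sitemap). &#8221; is a numeric character reference: plain ASCII text that names Unicode code point 8221 (U+201D, the right double quotation mark). It is therefore independent of the byte encoding of the file; the encoding only matters once the reference has been resolved to the actual character and that character has to be written out again.

```python
import html

# The reference itself is seven ASCII characters, valid in any encoding.
ref = "&#8221;"

# Resolving it yields a single Unicode character, U+201D.
char = html.unescape(ref)

# In UTF-8 that character takes three bytes.
utf8_bytes = char.encode("utf-8")  # b'\xe2\x80\x9d'

# ISO-8859-1 (Latin-1) has no slot for it, so direct encoding fails.
try:
    char.encode("iso-8859-1")
    fits_latin1 = True
except UnicodeEncodeError:
    fits_latin1 = False

# A serializer targeting a narrow encoding can fall back to a reference.
as_ascii = char.encode("ascii", errors="xmlcharrefreplace")  # b'&#8221;'
```

So a feed full of such references says nothing about whether the feed is UTF-8 or Latin-1; it only says the author chose to escape those characters.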

>> $ wget -q -O -
>> | grep '&#'
> are all punctuation chars that seem to be correctly applied

see above :) you're more than probably right

> I have never used coplets, nor even looked at them (deeply sorry)
> but I would certainly check the way these feeds are interpreted in the
> first place (rather than how they are serialized)
> if that is bad, then nothing further on in the pipe will be able to
> produce decent character streams regardless of encoding schemes you're
> trying out on the serializer
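
The point about interpretation versus serialization can be illustrated with a short Python sketch. This is not Cocoon code, and the sample string and variable names are made up for the illustration: once bytes have been decoded with the wrong charset, choosing a different output encoding afterwards does not bring the original text back.

```python
# Text containing curly quotes, as a feed might deliver it in UTF-8.
original = "\u201cquoted\u201d"
wire = original.encode("utf-8")

# Reading step goes wrong: UTF-8 bytes are interpreted as ISO-8859-1.
misread = wire.decode("iso-8859-1")

# Serializing the damaged text as UTF-8 just preserves the damage.
reserialized = misread.encode("utf-8")
still_broken = reserialized.decode("utf-8") != original  # True

# Decoding with the right charset in the first place gives the real text.
correct = wire.decode("utf-8")
```

If the characters already look wrong right after the generator, the serializer setting is probably not the place to look.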

This is the relevant part of my sitemap:
<map:match pattern="live.rss">
    <map:generate type="file" src="{request-param:feed}" label="content"/>
    <map:transform type="xslt" src="styles/rss2html.xsl">
        <map:parameter name="fullscreen" value="..."/>
    </map:transform>
    <map:serialize type="xml"/>
</map:match>

So my next thought was that the XSL is messing up the RSS, so I edited
the XSL and added this line after the opening <xsl:stylesheet> tag:

<xsl:output method="html" encoding="ISO-8859-1"/>

but it didn't help either. Maybe someone would like to take a look at
the XSL I attached to see whether there is something wrong with it?

> on the side: you don't need to set your serializer specific encoding if
> you have set the form-encoding init param in the web.xml to utf-8 (which
> I would suggest at all times)


And thanks a lot for your effort, everybody. I really appreciate it :)

best regards,

