cocoon-users mailing list archives

From Marc Portier <>
Subject Re: container-encoding vs. form-encoding... bug?
Date Mon, 23 Aug 2004 07:07:49 GMT
this wiki article should explain everything:

Mark Lundquist wrote:

> Hi,
> I'm using Cocoon w/ Jetty 4.2.15.  xalan was throwing a 
> SAXException trying to write a character (U+2026, &hellip;) that's not 
> representable "in the specified output encoding iso-8859-1".

probably somewhere in the serializer

> I made sure I had <xml:output encoding="UTF-8"> everywhere, but the 

to no avail (and I assume you wanted to type <xsl:output ...>  )

this directive is used by xalan if the 'xalan engine' is operating in a 
mode where it needs to transform AND serialize

cocoon however (having its reasons to separate the two operations) will 
override this line in the xsl anyway... for cocoon the end result of a 
transformer needs to be pure sax-events that will be piped through a 
serializer later on
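
For illustration, a minimal sketch of where that directive sits in a stylesheet. The identity template is only a hypothetical stand-in for a real stylesheet body, not something taken from this thread. Run standalone through xalan, the encoding here is honoured; inside a cocoon pipeline it is ignored and the serializer decides.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Honoured only when the XSLT engine serializes the result itself,
       e.g. when running the stylesheet standalone for debugging.
       Cocoon takes SAX events from the transformer and ignores this. -->
  <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

  <!-- Hypothetical body: a plain identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```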

Since cocoon overrides that anyway, you should feel free to use <xsl:output ...> 
in your stylesheets to ease your debugging work, so you can see the 
output of your stylesheet in your favourite encoding (and whatnot output 

(for API geeks, see:

> problem persisted.  Finally I figured out that I needed to check the 
> encoding parameters in web.xml.  Sure enough, container-encoding and 
> form-encoding were not set, and the comments indicate that they default 
> to iso-8859-1.
> So I set the container-encoding to UTF-8, and that didn't have any 
> effect.  Only when I set form-encoding to UTF-8 did my problem go away. 

container-encoding should be set to the encoding your chosen container 
(jetty) is using to decode (the body of) HTTP-requests

most containers use iso-8859-1 here, so you should just leave it alone unless 
you know your container does it differently

a recent post taught us that Jetty will let you set it yourself by 
specifying a system property -Dorg.mortbay.util.URI.charset=UTF-8
(see: )

so only if you play with that property should you get into changing the 
container-encoding in the web.xml

>  The thing is, the character that was causing the problem isn't coming 
> from the request!  I expected container-encoding to be the one that 
> would affect the behavior I was seeing.

as you have found out by now, the container-encoding setting only comes 
into play when the HTTP-request's body is read in some way

> So, am I just not understanding something correctly?  Or is it a bug, 
> and if so is it a problem with Cocoon or with Jetty?

what really needs to happen in this story is telling the SERIALIZER in 
cocoon about what encoding to use

it's quite logical: the <xsl:output ...> directive is overridden in the 
transformer part, so we need to inject that info back again. Since this 
is about the serialization part of things, you should give that info to 
the serializer. So how do you do that?

1/ You do that on a local level (one serializer) by applying the hints 
Jan just gave in his post. (setting map:serializer/@mime-type and 

2/ You do that on a global level (default for all text-serializers) by 
doing what you did: setting the form-encoding in web.xml.
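
As a rough sketch of option 2/, assuming the stock Cocoon 2.1 web.xml layout: the two init-params named in this thread sit on the Cocoon servlet declaration. The servlet name and the surrounding elements are assumptions about a typical install; container-encoding is left at the container's default as advised above.

```xml
<servlet>
  <servlet-name>Cocoon</servlet-name>
  <servlet-class>org.apache.cocoon.servlet.CocoonServlet</servlet-class>

  <!-- Encoding the container (jetty) uses to decode requests.
       Leave as-is unless the container itself was reconfigured. -->
  <init-param>
    <param-name>container-encoding</param-name>
    <param-value>ISO-8859-1</param-value>
  </init-param>

  <!-- Encoding for request parameters; per this thread it also
       becomes the default encoding for the text serializers. -->
  <init-param>
    <param-name>form-encoding</param-name>
    <param-value>UTF-8</param-value>
  </init-param>
</servlet>
```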

Historically that setting also comes into play in the area of 
request-parameters.  However there is a 'bug' (well, maybe rather a 
'historic way of interpreting' the specs) in most browsers that will 
make them apply the same form-encoding to their requests as the one 
applicable to the form asking for the request-parameters.  Because of 
this client-side coupling we opted to make the applied form-encoding 
also be the default for our serializers.
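
For the local route (option 1/ above), a hedged sketch of a single serializer declaration in the sitemap, assuming the standard Cocoon 2.1 HTMLSerializer. The mime-type attribute is the setting named in the message; the <encoding> child element is an assumption here, since the original sentence is cut off before naming the second setting.

```xml
<map:serializers default="html">
  <!-- The charset in mime-type tells the browser what it is getting;
       <encoding> (assumed) tells the serializer what bytes to write. -->
  <map:serializer name="html"
                  mime-type="text/html; charset=UTF-8"
                  src="org.apache.cocoon.serialization.HTMLSerializer">
    <encoding>UTF-8</encoding>
  </map:serializer>
</map:serializers>
```

Keeping both values in sync avoids declaring one charset in the response header while writing another.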

Marc Portier                  
Outerthought - Open Source, Java & XML Competence Support Center
