cocoon-dev mailing list archives

From Marc Portier <>
Subject Re: [Help]How can I use non-ascii file name?
Date Tue, 17 Aug 2004 15:20:11 GMT
(repost: just noticed I forgot to copy dev-list)

Pier Fumagalli wrote:

> On 16 Aug 2004, at 14:02, Pier Fumagalli wrote:
>> I'll see why this happens in Jetty, I'll poke Jen and Greg to have
>> either a fix, or an explanation and workaround... For now, brrrr, I
>> think that the hack is the only way to go...
> I don't know about Tomcat, but if you're not on the jetty developers 
> list, here's the outcome:

I'm not, thx for copying over...

> Jetty defaults (for compatibility to all the other broken containers, 
> and because there's no "official standard" about UTF-8 URIs) to 
> ISO-8859-1. And this ain't great.
> Now, the good thing is that if you start your jetty specifying the 
> "org.mortbay.util.URI.charset" system property, it will use that one as 
> the charset used for decoding URLs.
> So, by putting in "-Dorg.mortbay.util.URI.charset=UTF-8" we get the 
> expected behavior.
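
To illustrate why the decoding charset matters, here is a minimal Java sketch (not Jetty's actual code). It uses java.net.URLDecoder as a stand-in for the container's URI decoder; note that URLDecoder also turns '+' into a space, which a real path decoder would not. The class and method names are made up for illustration, and the fallback to ISO-8859-1 only mirrors the default described above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Illustrative only: the same percent-encoded bytes decode to different
// file names depending on the charset the container assumes.
class UriCharsetDemo {

    // Decode a percent-encoded string with the given charset name.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    // Assumption: read the property named in the mail, falling back to
    // ISO-8859-1 when it is not set.
    static String configuredCharset() {
        return System.getProperty("org.mortbay.util.URI.charset", "ISO-8859-1");
    }

    public static void main(String[] args) {
        String encoded = "caf%C3%A9.xml"; // "caf\u00e9.xml" percent-encoded as UTF-8
        System.out.println(decode(encoded, "UTF-8"));       // intended name
        System.out.println(decode(encoded, "ISO-8859-1"));  // two wrong chars instead of one
        System.out.println(decode(encoded, configuredCharset()));
    }
}
```

Run with -Dorg.mortbay.util.URI.charset=UTF-8, the third call decodes the same way as the first; without the property it decodes the same way as the second.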


> How about setting it up as the default behavior for Cocoon's internal 
> Jetty distro?

makes sense, but (wishing all this brokenness wasn't there, but alas):

- it shouldn't keep us from actually solving it for all
containers (my guess is that just a fraction of cocoon deployments
actually run on the internal jetty distro, i.e. using the or

- now that we know about this org.mortbay.util.URI.charset property, we
should probably use it to override (or at least log a warning to deployers
if it differs from) the container-encoding setting in the web.xml
(assuming that the mentioned property will also be in effect when
decoding the request parameters, and taking into account that current
cocoon code assumes ISO-8859-1 as the default there)

- once we've run that far, we might even consider making a scan of other
servlet containers and how they possibly allow setting the
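
The log-warn idea from the second point could look roughly like the sketch below. This is only an assumption about how such a check might be written, not existing cocoon code: the class and method names are hypothetical, and how the container-encoding value is read from web.xml is left out.

```java
import java.nio.charset.Charset;

// Hypothetical consistency check between the container-encoding init-param
// and the Jetty URI charset system property.
class EncodingConsistencyCheck {

    // Returns a warning message, or null when there is nothing to report.
    // Charset.forName throws an unchecked exception for unknown charset names.
    static String check(String containerEncoding, String uriCharsetProperty) {
        if (uriCharsetProperty == null) {
            return null; // property not set, nothing to compare against
        }
        // Default assumed by current cocoon code, per the mail.
        String effective = (containerEncoding != null) ? containerEncoding : "ISO-8859-1";
        Charset configured = Charset.forName(effective);
        Charset fromProperty = Charset.forName(uriCharsetProperty);
        if (configured.equals(fromProperty)) {
            return null; // same charset, possibly spelled differently
        }
        return "container-encoding is " + configured.name()
                + " but org.mortbay.util.URI.charset is " + fromProperty.name();
    }

    public static void main(String[] args) {
        String warning = check("ISO-8859-1",
                System.getProperty("org.mortbay.util.URI.charset"));
        if (warning != null) {
            System.err.println("WARN: " + warning);
        }
    }
}
```

Comparing Charset objects rather than raw strings avoids false warnings when the two settings differ only in case.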


while typing, I started wondering why we ended up with this
container-encoding init-param in web.xml in the first place

IIRC we did that because we had to stay compliant with servlet spec
versions prior to 2.3?  So the first question is: are we still on servlet 2.2?

If not: since 2.3 there is a setCharacterEncoding() method on the request:
<quote from="servlet 2.3 javadoc">
   Overrides the name of the character encoding used in the body of this
   request. This method must be called prior to reading request
   parameters or reading input using getReader().
</quote>

- I assume the cocoon servlet could easily arrange to call the
method before anything else
- I'm a bit unsure whether implementations will interpret the javadoc's
'in the body of this request' as limiting the scope, and if so, whether
they treat the URI (and the request params, GET vs POST) as part of
that scope or not
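
The "call it before anything else" ordering can be sketched as below. To keep the example compilable without the servlet API, RequestLike is a hypothetical stand-in that mirrors only the signatures of ServletRequest.setCharacterEncoding(String) and getParameter(String); the other names are made up as well. It shows the call order only and says nothing about whether a given container applies the encoding to the URI or to GET parameters.

```java
import java.io.UnsupportedEncodingException;
import java.util.ArrayList;
import java.util.List;

class EncodingFirstSketch {

    // Hypothetical stand-in mirroring two javax.servlet.ServletRequest methods.
    interface RequestLike {
        void setCharacterEncoding(String env) throws UnsupportedEncodingException;
        String getParameter(String name);
    }

    // Set the encoding first, and only then touch the parameters.
    static String readParameter(RequestLike request, String encoding, String name) {
        try {
            request.setCharacterEncoding(encoding);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        }
        return request.getParameter(name);
    }

    // Fake request that only records the order in which it is called.
    static class RecordingRequest implements RequestLike {
        final List<String> calls = new ArrayList<String>();

        public void setCharacterEncoding(String env) {
            calls.add("setCharacterEncoding:" + env);
        }

        public String getParameter(String name) {
            calls.add("getParameter:" + name);
            return null; // no real parameters in this fake
        }
    }

    public static void main(String[] args) {
        RecordingRequest request = new RecordingRequest();
        readParameter(request, "UTF-8", "filename");
        System.out.println(request.calls);
        // prints [setCharacterEncoding:UTF-8, getParameter:filename]
    }
}
```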

(talk about possible confusion when writing specs like this, yuk!)

(sorry for just popping up the questions, lacking the time to
investigate deeper myself ATM)
Marc Portier                  
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at                          
