cocoon-docs mailing list archives

From d...@cocoon.apache.org
Subject [Cocoon Wiki] Updated: RequestParameterEncoding
Date Tue, 03 Aug 2004 08:37:20 GMT
   Date: 2004-08-03T01:37:19
   Editor: ViPi <michbur@arcor.de>
   Wiki: Cocoon Wiki
   Page: RequestParameterEncoding
   URL: http://wiki.apache.org/cocoon/RequestParameterEncoding

   no comment

Change Log:

------------------------------------------------------------------------------
@@ -10,13 +10,13 @@
 
 First of all, check in the sitemap what encoding is used when serializing HTML pages:
 
-{{{
-<map:serializer logger="sitemap.serializer.html" mime-type="text/html"
-       name="html" pool-grow="4" pool-max="32" pool-min="4"
-       src="org.apache.cocoon.serialization.HTMLSerializer">
-  <buffer-size>1024</buffer-size>
-  <encoding>UTF-8</encoding>
-</map:serializer>
+{{{
+<map:serializer logger="sitemap.serializer.html" mime-type="text/html"
+       name="html" pool-grow="4" pool-max="32" pool-min="4"
+       src="org.apache.cocoon.serialization.HTMLSerializer">
+  <buffer-size>1024</buffer-size>
+  <encoding>UTF-8</encoding>
+</map:serializer>
 }}}
 
 In the example above, UTF-8 is the encoding used. This is a widely supported Unicode encoding,
so it is often a good choice.
@@ -24,34 +24,35 @@
 The HTML serializer will automatically insert a <meta> tag into the HTML page's HEAD
element specifying the encoding. Most browsers apparently require this. The HTML serializer
will however only do this if your page already
 contains a HEAD (or head) element, so make sure it has one. The <meta> element inserted
by the serializer will then look as follows:
 
-{{{
-<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+{{{
+<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
 }}}
 
 Mozilla (tested with 1.4), Netscape 7.1 and Internet Explorer 6 will not respond to the setting
of this meta tag, whereas they do respond to the HTTP response header "Content-Type". So you
may have to subclass
 the HTMLSerializer and let it add this header in order to get Mozilla and IE working.[[BR]]
 -- ''Someone added this last paragraph here. Good advice (haven't found time to verify it
yet though), but if this is the case we should fix this in Cocoon. Patches welcome in bugzilla.
(BrunoDumon).''[[BR]]
--- ''I can confirm it and the effect is obvious when using a recent Tomcat (> 4.1.27):
[http://issues.apache.org/bugzilla/show_bug.cgi?id=26997 Bug #26997]. But AFAIK the above
must read 'will not respond to the setting of this meta tag '''if''' the encoding/charset
in the "Content-Type" header is set' and Cocoon's problem is that it does not set the encoding/charset
and recent Tomcats set it to the default ISO-8859-1. (JoergHeinicke)''
+-- ''I can confirm it and the effect is obvious when using a recent Tomcat (> 4.1.27):
[http://issues.apache.org/bugzilla/show_bug.cgi?id=26997 Bug #26997]. But AFAIK the above
must read 'will not respond to the setting of this meta tag '''if''' the encoding/charset
in the "Content-Type" header is set' and Cocoon's problem is that it does not set the encoding/charset
and recent Tomcats set it to the default ISO-8859-1. (JoergHeinicke)''[[BR]]
+-- ''When the HTML serializer is configured with {{{<encoding>UTF-8</encoding>}}} to output
UTF-8, it should also use the matching setting for the HTTP header, {{{mime-type="text/html;
charset=utf-8"}}}, so that the correct information is sent to the browser. (Volkmar W. Pogatzki)''
 
 By default, if the browser doesn't explicitly mention the encoding, a servlet container
will decode request parameters using the ISO-8859-1 encoding (independent of the platform
on which the container is running). So in the above case where UTF-8 was used when serializing,
we would be facing problems.
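
To see the problem concretely, the following minimal Java sketch imitates a browser that percent-encodes a form value as UTF-8 and a container that decodes it as ISO-8859-1. The class and method names (DefaultDecodingDemo, browserEncode, containerDecode) are made up for illustration and are not part of Cocoon or of any servlet container. The Charset overloads of URLEncoder.encode and URLDecoder.decode used here assume Java 10 or later.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class DefaultDecodingDemo {

    /** Percent-encodes a value as a browser would for a UTF-8 page. */
    public static String browserEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    /** Decodes the raw value with the container default, ISO-8859-1. */
    public static String containerDecode(String raw) {
        return URLDecoder.decode(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String typed = "caf\u00e9";          // what the user typed (e with acute accent)
        String raw = browserEncode(typed);   // two bytes for the accented letter: %C3%A9
        String seen = containerDecode(raw);  // each byte becomes its own character

        System.out.println(raw);
        System.out.println(seen.length());       // one character longer than the input
        System.out.println(seen.equals(typed));  // false: the parameter is garbled
    }
}
```

The accented letter arrives as two separate characters (U+00C3 and U+00A9), which is the typical symptom of this mismatch.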
 
 The encoding to use when decoding request parameters can be configured in the web.xml by
supplying init parameters called "form-encoding" and "container-encoding" to the Cocoon servlet.
The container-encoding parameter names the encoding with which the container decoded
the request parameters (normally ISO-8859-1), and the form-encoding parameter names the encoding
in which the form data was actually sent. Here's an example of how to specify the parameters in the web.xml:
 
-{{{
-<init-param>
-  <param-name>container-encoding</param-name>
-  <param-value>ISO-8859-1</param-value>
-</init-param>
-<init-param>
-  <param-name>form-encoding</param-name>
-  <param-value>UTF-8</param-value>
-</init-param>
+{{{
+<init-param>
+  <param-name>container-encoding</param-name>
+  <param-value>ISO-8859-1</param-value>
+</init-param>
+<init-param>
+  <param-name>form-encoding</param-name>
+  <param-value>UTF-8</param-value>
+</init-param>
 }}}
 
 For Java-insiders: what Cocoon actually does internally is apply the following trick to get
a parameter correctly decoded: suppose "value" is a string containing a request parameter,
then Cocoon will do:
 
-{{{
-value = new String(value.getBytes("ISO-8859-1"), "UTF-8");
+{{{
+value = new String(value.getBytes("ISO-8859-1"), "UTF-8");
 }}}
 
 So it recodes the incorrectly decoded string back to bytes and decodes it using the correct
encoding.
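
The same trick can be tried in isolation. This is only a sketch, not Cocoon's actual source: RecodeDemo and recode are hypothetical names, and the two Charset arguments stand in for the container-encoding and form-encoding settings.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class RecodeDemo {

    /**
     * Undoes a wrong decode: turns the string back into the bytes the
     * container saw, then decodes those bytes with the real form encoding.
     */
    public static String recode(String value, Charset containerEncoding, Charset formEncoding) {
        return new String(value.getBytes(containerEncoding), formEncoding);
    }

    public static void main(String[] args) {
        // "caf\u00e9" sent as UTF-8 but decoded as ISO-8859-1 arrives as "caf\u00c3\u00a9".
        String misdecoded = "caf\u00c3\u00a9";
        String fixed = recode(misdecoded, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);
        System.out.println(fixed.equals("caf\u00e9")); // true
    }
}
```

The round trip loses nothing because ISO-8859-1 maps every byte value to exactly one character, so getBytes recovers the bytes the browser originally sent.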
@@ -61,15 +62,15 @@
 Cocoon is ideally suited for publishing to different kinds of devices, and some devices may
require different encodings. In that case, you can redefine the form-encoding for specific
pipelines using the !SetCharacterEncodingAction.
 
 To use it, first of all make sure the action is declared in the map:actions element of the
sitemap:
-{{{
-<map:action name="set-encoding" src="org.apache.cocoon.acting.SetCharacterEncodingAction"/>
+{{{
+<map:action name="set-encoding" src="org.apache.cocoon.acting.SetCharacterEncodingAction"/>
 }}}
 
 and then call the action at the required location as follows:
-{{{
-<map:act type="set-encoding">
-  <map:parameter name="form-encoding" value="some-other-encoding"/>
-</map:act>
+{{{
+<map:act type="set-encoding">
+  <map:parameter name="form-encoding" value="some-other-encoding"/>
+</map:act>
 }}}
 
 == Problems with components using the original !HttpServletRequest (JSPGenerator, ...) ==
@@ -82,28 +83,28 @@
 
 Now modify your webapp's web.xml file to include the following (after the display-name and
description elements, but before the servlet element):
 
-{{{
-<filter>
-  <filter-name>Set Character Encoding</filter-name>
-  <filter-class>filters.SetCharacterEncodingFilter</filter-class>
-  <init-param>
-    <param-name>encoding</param-name>
-    <param-value>UTF-8</param-value>
-  </init-param>
-</filter>
-
-<filter-mapping>
-  <filter-name>Set Character Encoding</filter-name>
-  <url-pattern>/*</url-pattern>
-</filter-mapping>
+{{{
+<filter>
+  <filter-name>Set Character Encoding</filter-name>
+  <filter-class>filters.SetCharacterEncodingFilter</filter-class>
+  <init-param>
+    <param-name>encoding</param-name>
+    <param-value>UTF-8</param-value>
+  </init-param>
+</filter>
+
+<filter-mapping>
+  <filter-name>Set Character Encoding</filter-name>
+  <url-pattern>/*</url-pattern>
+</filter-mapping>
 }}}
 
 Since the filter element is new in the servlet 2.3 specification, you might need to modify
the DOCTYPE declaration in the web.xml:
 
-{{{
-<!DOCTYPE web-app
-    PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
-    "http://java.sun.com/dtd/web-app_2_3.dtd">
+{{{
+<!DOCTYPE web-app
+    PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
+    "http://java.sun.com/dtd/web-app_2_3.dtd">
 }}}
 
 When using a servlet filter to set the encoding, you should no longer supply the form-encoding
init parameter in the web.xml. You can still supply the container-encoding parameter,
but its value must then be the same as the encoding supplied to the filter. This
lets you override the form-encoding using the !SetCharacterEncodingAction, though
only for the Cocoon Request object.
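
Why the container-encoding has to match the filter's encoding can be illustrated with the recoding step described earlier. The sketch below assumes that Cocoon recodes from container-encoding to form-encoding in that same way once a form-encoding is in effect; that is an assumption, not something verified against the Cocoon source. The class name ContainerEncodingMismatchDemo and its recode method are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ContainerEncodingMismatchDemo {

    /** Same recoding step as before: back to bytes, then decode again. */
    public static String recode(String value, Charset containerEncoding, Charset formEncoding) {
        return new String(value.getBytes(containerEncoding), formEncoding);
    }

    public static void main(String[] args) {
        // The filter has already made the container decode this value correctly as UTF-8.
        String correct = "caf\u00e9";

        // container-encoding equal to the filter's encoding: the value survives.
        String kept = recode(correct, StandardCharsets.UTF_8, StandardCharsets.UTF_8);

        // container-encoding left at ISO-8859-1: the single byte produced for the
        // accented letter is not valid UTF-8, so decoding substitutes U+FFFD.
        String broken = recode(correct, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8);

        System.out.println(kept.equals(correct));      // true
        System.out.println(broken.contains("\uFFFD")); // true
    }
}
```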
