Please let me know if this is the right place to post this.
There seems to be a problem in Cocoon regarding URL encoding. In this
case we're using the authentication framework, which internally calls a
pipeline to do the authentication. That pipeline in turn uses a
generator we have written. Within the generator, we're accessing values
of the parameters object passed in via public void setup(SourceResolver
resolver, Map objectModel, String src, Parameters parameters).
In this case (and the problem really seems to be limited to pipelines
called internally within Cocoon, as done by the authentication framework)
German umlauts are not handled correctly. The problem basically is that
Cocoon at some point encodes the umlauts in
    buf.append(resourceParameters.getEncodedQueryString());
(org.apache.cocoon.components.source.SourceUtil, line 598)
and at a later point, before calling setup(...) on the generator,
decodes them again in
org.apache.cocoon.environment.wrapper.RequestParameters.java, method
private String parseName(String s).
When encoding, an umlaut like ä is encoded as %C3%A4 (as 2 bytes,
which is the UTF-8 encoding that is also returned when
executing java.net.URLEncoder.encode("ä", "UTF-8")).
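
To double-check the encoding side outside of Cocoon, here is a minimal
sketch. The class name EncodeUmlautSketch and the helper encode are
made up for illustration; only java.net.URLEncoder is the real API. The
umlaut is written as \u00e4 so the result does not depend on the
encoding of the source file.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class, not part of Cocoon.
final class EncodeUmlautSketch {

    // URL-encodes s as UTF-8, the same call quoted in the text above.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u00e4 is the character ä; prints %C3%A4
        System.out.println(encode("\u00e4"));
    }
}
```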
When decoding %C3%A4 using the parseName method, this produces garbage
instead of the original ä. This is due to the handling of escape
sequences in parseName, which does not support a 2-byte encoding
because of
    case '%':
        try {
            sb.append((char)
                Integer.parseInt(s.substring(i+1, i+3), 16));
            i += 2;
which treats each %xx as a separate character and doesn't handle the
case where %xx%yy is actually one character. It looks like
getEncodedQueryString and parseName do not work with the same encoding
scheme, or rather parseName implicitly assumes an encoding scheme that
is not UTF-8 (casting each byte value to a char amounts to ISO-8859-1).
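
To illustrate the mismatch, here is a minimal standalone sketch. It is
not the Cocoon code: the class PercentDecodeSketch and both method
names are made up. decodePerByte imitates the behaviour described above
(one char per %xx), while decodeUtf8 shows one possible alternative that
collects the bytes first and only then decodes them as UTF-8. It assumes
the encoded input is plain ASCII and lets malformed escapes fail with a
NumberFormatException.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not the actual RequestParameters code.
final class PercentDecodeSketch {

    // Imitates the described behaviour: each %xx becomes one char.
    static String decodePerByte(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                sb.append((char) Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else if (c == '+') {
                sb.append(' ');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Alternative: collect raw bytes, then decode the whole run as UTF-8.
    static String decodeUtf8(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                out.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else if (c == '+') {
                out.write(' ');
            } else {
                out.write(c); // assumption: unescaped input chars are ASCII
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two chars, U+00C3 and U+00A4, instead of one ä
        System.out.println(decodePerByte("%C3%A4").length()); // prints 2
        // A single char, U+00E4 (ä)
        System.out.println(decodeUtf8("%C3%A4").equals("\u00e4")); // prints true
    }
}
```

The standard java.net.URLDecoder.decode(s, "UTF-8") handles multi-byte
sequences in the same spirit, so it may be an option wherever the
decoding is done.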
Any ideas?
Best
Ralph