cocoon-users mailing list archives

From Ralph Rauscher <...@blue-elephant-systems.com>
Subject Problem in authentication framework when using umlauts
Date Tue, 07 Jul 2009 19:20:38 GMT
Please let me know if this is the right place to post this.

There seems to be a problem in Cocoon regarding URL encoding. We are 
using the authentication framework, which internally calls a pipeline to 
do the authentication; that pipeline in turn uses a generator we have 
written. Within the generator, we access values of the Parameters object 
passed in via public void setup(SourceResolver resolver, Map 
objectModel, String src, Parameters parameters).

The problem really seems to be limited to pipelines called internally 
within Cocoon, as the authentication framework does. In that case 
German umlauts are not handled correctly. Basically, Cocoon at some 
point encodes the umlauts in

buf.append(resourceParameters.getEncodedQueryString()); 
(org.apache.cocoon.components.source.SourceUtil, Line 598)

and at a later point, before calling setup(...) on the generator, 
decodes them again in

org.apache.cocoon.environment.wrapper.RequestParameters.java, method 
private String parseName(String s).

When encoding, an umlaut like ä is encoded as %C3%A4 (two bytes, which 
is the UTF-8 encoding and is also what 
java.net.URLEncoder.encode("ä", "UTF-8") returns).
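
To make the encoding side concrete, here is a minimal sketch using only 
java.net.URLEncoder. The class name UmlautEncodeDemo and the helper 
encode(...) are made up for this illustration and are not part of Cocoon; 
the ISO-8859-1 call is included only for comparison.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class UmlautEncodeDemo {

    // Hypothetical helper: percent-encodes s using the given charset name.
    static String encode(String s, String charset) {
        try {
            return URLEncoder.encode(s, charset);
        } catch (UnsupportedEncodingException e) {
            // UTF-8 and ISO-8859-1 are required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String umlaut = "\u00e4"; // the character ä, written as an escape

        // UTF-8 represents ä as the two bytes C3 A4, hence two escapes.
        System.out.println(encode(umlaut, "UTF-8"));      // prints %C3%A4

        // ISO-8859-1 represents ä as the single byte E4, hence one escape.
        System.out.println(encode(umlaut, "ISO-8859-1")); // prints %E4
    }
}
```

So the number of %xx escapes per character depends on the charset used 
when encoding, and the decoder has to use the same one.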

When decoding %C3%A4 using the parseName method, this produces garbage 
instead of the original ä umlaut. This is due to the handling of escape 
sequences in parseName, which does not support a two-byte encoding 
because of

                case '%':
                    try {
                        sb.append((char) Integer.parseInt(s.substring(i+1, i+3), 16));
                        i += 2;

which treats each %xx as a separate character and doesn't handle the 
case where %xx%yy is actually one character. It looks like 
getEncodedQueryString and parseName do not work with the same encoding 
scheme, or rather parseName implicitly assumes an encoding scheme that 
is not UTF-8 (casting each byte value to char amounts to ISO-8859-1).
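
The mismatch can be reproduced outside Cocoon. The sketch below is not the 
Cocoon source: decodePerEscape is a simplified, hypothetical stand-in that 
handles only the '%' case in the same way as the fragment quoted above, and 
decodeUtf8 uses java.net.URLDecoder as one possible byte-aware alternative.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

public class ParseNameDemo {

    // Hypothetical stand-in for the quoted logic: every %xx escape is cast
    // directly to one char, so each byte becomes its own character.
    static String decodePerEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%') {
                sb.append((char) Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Byte-aware decoding: collects the escaped bytes and interprets them
    // as UTF-8, so %C3%A4 becomes a single character.
    static String decodeUtf8(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String encoded = "%C3%A4"; // UTF-8 percent-encoding of ä

        // Two chars, U+00C3 and U+00A4, instead of the single ä.
        System.out.println(decodePerEscape(encoded).length()); // prints 2

        // One char, U+00E4.
        System.out.println(decodeUtf8(encoded).length());      // prints 1

        // The per-escape approach only round-trips single-byte input.
        System.out.println(decodePerEscape("%E4").equals("\u00e4")); // prints true
    }
}
```

The last line shows that the per-escape approach works when the input was 
encoded as ISO-8859-1, which fits the suspicion that the two methods assume 
different encodings.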

Any ideas?

Best
    Ralph

