cocoon-dev mailing list archives

From "Ralph (JIRA)" <j...@apache.org>
Subject [jira] Created: (COCOON-2263) Problem in authentication framework when using umlauts or other language specific characters
Date Wed, 22 Jul 2009 18:27:15 GMT
Problem in authentication framework when using umlauts or other language specific characters
--------------------------------------------------------------------------------------------

                 Key: COCOON-2263
                 URL: https://issues.apache.org/jira/browse/COCOON-2263
             Project: Cocoon
          Issue Type: Bug
          Components: * Cocoon Core
    Affects Versions: 2.1.9
            Reporter: Ralph


There is a problem in Cocoon regarding URL encoding. We saw this problem on Cocoon 2.1.9;
however, it likely applies to Cocoon 2.1.11 and the 2.2 versions as well, as the relevant code
parts are identical.

What happens is the following:

We're using the authentication framework, which internally calls a pipeline to do the authentication.
That pipeline in turn uses a generator we have written. Within the generator, we're accessing values
of the parameters object passed in via public void setup(SourceResolver resolver, Map objectModel,
String src, Parameters parameters).

In this case (and the problem really seems to be limited to pipelines called internally within
Cocoon, as done by the authentication framework), German umlauts and other language-specific
characters (Russian ones, for example) are not handled correctly when getting values
from the parameters object. The problem basically is that Cocoon at some point during authentication
URL-encodes the umlauts in

buf.append(resourceParameters.getEncodedQueryString()); 
-> org.apache.cocoon.components.source.SourceUtil.java, Line 598

and at a later point, before calling the setup(...) method of the generator, decodes them again in

org.apache.cocoon.environment.wrapper.RequestParameters.java, method private String parseName(String
s).

During encoding, an umlaut like ä is encoded as %C3%A4. These are two (!) percent-escaped bytes,
the UTF-8 encoding of ä, and the same result that java.net.URLEncoder.encode("ä", "UTF-8")
returns.

When Cocoon later decodes %C3%A4 using the parseName method, though, this produces garbage
instead of the original ä umlaut. This is due to the handling of escape characters in parseName,
which does not support a two-byte encoding because of

               case '%':
                   try {
                       sb.append((char) Integer.parseInt(s.substring(i+1, i+3),
                             16));
                       i += 2;
-> org.apache.cocoon.environment.wrapper.RequestParameters.java, starting line 43

This code treats each %xx as a separate character and doesn't handle the case where %xx%yy actually
represents only one character. It looks like getEncodedQueryString and parseName do not
work with the same encoding scheme, or rather that parseName implicitly assumes a certain
encoding scheme which is not UTF-8 (casting each byte value to char amounts to ISO-8859-1).
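
The mismatch can be reproduced outside Cocoon with a small sketch. The class PercentDecodeSketch and both methods are hypothetical: decodePerEscape imitates the per-escape cast shown in the excerpt above, and decodeUtf8 is one possible way to decode consistently with the encoder, not the actual or proposed Cocoon fix. The sketch only handles % escapes and assumes well-formed hex digits.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not Cocoon code.
public class PercentDecodeSketch {

    // Imitates the reported behaviour: every %xx becomes one char.
    static String decodePerEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                sb.append((char) Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Collects each run of consecutive %xx escapes as raw bytes and
    // decodes the whole run as UTF-8, so %C3%A4 yields a single character.
    // Malformed hex digits would throw NumberFormatException here.
    static String decodeUtf8(String s) {
        StringBuilder sb = new StringBuilder();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                bytes.reset();
                while (i + 2 < s.length() && s.charAt(i) == '%') {
                    bytes.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                    i += 3;
                }
                sb.append(new String(bytes.toByteArray(), StandardCharsets.UTF_8));
            } else {
                sb.append(c);
                i++;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(decodePerEscape("%C3%A4").length()); // 2 (two garbage characters)
        System.out.println(decodeUtf8("%C3%A4").length());      // 1 (the original umlaut)
    }
}
```

For a plain query-string value, java.net.URLDecoder.decode(s, "UTF-8") would likely give the same result as decodeUtf8 and additionally turns '+' into a space.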

In this case, the authentication will fail as the data passed in to the generator within the
authentication pipeline has been corrupted. This might generally apply to other situations
as well where pipelines are called internally within Cocoon.

Let me know if you need any more information.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

