tomcat-users mailing list archives

From Christopher Schultz <ch...@christopherschultz.net>
Subject Re: mod_jk codepage in header values
Date Mon, 25 Jan 2010 19:33:37 GMT
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Mirko,

On 1/25/2010 4:24 AM, Mirko Solic wrote:
> On Thu, 2010-01-21 at 10:34 -0500, Christopher Schultz wrote:
>> What would be better is to do something like this:
>>
>> java.net.URLEncoder.encode(request.getHeader(headerName), "UTF-8")
>>
>> Of course, this will only work if your client knows that's how the
>> encoding will be done.
> 
> Yes, but what if mod_jk chooses not to send non-ISO-8859-1 header values
> over to the Tomcat side?

This is simply not mod_jk's job: mod_jk pretty much delivers the exact
bytes sent by the client. Trust me: it's better that way.
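
Since the bytes pass through untouched, the percent-encoding idea quoted
above can be sketched roughly as follows. This is only an illustration,
assuming both ends agree on UTF-8 percent-encoding; the class and method
names (HeaderValueCodec, encode, decode) are hypothetical and not part of
mod_jk or Tomcat.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper: makes a text value safe to carry in an HTTP header
// by percent-encoding its UTF-8 bytes, so only US-ASCII goes over the wire.
class HeaderValueCodec {

    // Sender side: turn arbitrary text into a pure US-ASCII string.
    static String encode(String raw) {
        try {
            return URLEncoder.encode(raw, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Receiver side: reverse the encoding. This only works if the sender
    // really used UTF-8 percent-encoding (an assumption of this sketch).
    static String decode(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String name = "\u0160oli\u0107"; // sample value with two non-ASCII letters
        String wire = encode(name);
        System.out.println(wire);
        System.out.println(decode(wire).equals(name));
    }
}
```

On the Tomcat side you would call decode(request.getHeader(headerName)),
but only if whatever sets the header has been changed to encode it first.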

> According to André Warnier:
>
>> But, because the HTTP RFC specifies that HTTP headers should contain
>> only US-ASCII character data, mod_jk would be allowed, if it finds
>> non-US-ASCII data in a HTTP header, to strip this data or ignore the
>> header or something like that.  I don't know if mod_jk actually does
>> this, but if it did, it would be justified, because according to the
>> HTTP RFC this would be an invalid header.
> 
> Then I have no values to decode.

I can tell you there's no reason for mod_jk to do this, and I don't
believe it does: the testing I have performed does not demonstrate
that behavior.

>> AAI needs to support whatever encoding you intend to use. You can't
>> simply transcode things in an arbitrary way and expect AAI to work
>> properly. What does their documentation say about what format these
>> values should take?
> 
> The problem is when I want to get data from AAI. AAI is sending data in
> UTF-8, but this is broken when the data is sent from the Apache side to
> the Tomcat side.

So, the bytes are being sent as UTF-8 instead of US-ASCII. I think
you're back to where we started: re-encoding strings. It's possible that
you may run into a situation where the re-encoding is simply going to
fail because of how badly the string has been damaged by an incorrect
decoding. Maybe that's not an issue with ISO-8859-1 (at least it's a
1-byte encoding and all bytes are ostensibly legal).
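
For what it's worth, the re-encoding can be sketched like this. It
assumes the header's UTF-8 bytes were decoded as ISO-8859-1 somewhere
along the way, which you would have to confirm for your setup; the class
and method names (HeaderRecode, recover) are made up for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: undoes a wrong ISO-8859-1 decoding of UTF-8 bytes.
class HeaderRecode {

    // ISO-8859-1 maps every byte 0x00-0xFF to exactly one character, so
    // getBytes(ISO_8859_1) gives back the original bytes unchanged. Those
    // bytes are then decoded again, this time as UTF-8.
    static String recover(String misdecoded) {
        byte[] originalBytes = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String sent = "\u0160oli\u0107"; // what the client meant to send
        // Simulate the damage: UTF-8 bytes read as if they were ISO-8859-1.
        String garbled = new String(sent.getBytes(StandardCharsets.UTF_8),
                                    StandardCharsets.ISO_8859_1);
        System.out.println(recover(garbled).equals(sent));
    }
}
```

If the wrong decoding used some other charset, one where certain bytes
are not legal, the original bytes may already be lost and this will not
bring them back.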

>> A better strategy would be for AAI to provide a numeric token (easily
>> passable in HTTP headers without any encoding issues) and then provide
>> an HTTP-based and/or XML-based API that uses proper document encoding to
>> send textual data across the wire.
>>
>> Using HTTP headers for text data sucks!
> 
> I agree with you here: using HTTP headers for text data sucks! But AAI
> is not supported on Tomcat yet. However, it is supported on Apache, and
> the only way for me to use AAI with Tomcat is through the mod_jk
> connector. But mod_jk is transporting environment variables from Apache
> to Tomcat in HTTP headers.

That sounds like an AAI bug, not an httpd/mod_jk/Tomcat bug: mod_jk
sends environment variables as request /attributes/, not request
headers. (See the "JkEnvVar" directive in
http://tomcat.apache.org/connectors-doc/reference/apache.html). If AAI
is creating new request headers, it's AAI's fault for incorrectly
formatting them. If you can get this data from a request /attribute/
instead, then maybe that's a better option (though there are no
references to character encoding in the documentation for JkEnvVar).
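
Reading such a value on the Tomcat side might look something like the
sketch below. It assumes the variable has been forwarded with JkEnvVar
on the httpd side; the attribute name AAI_DISPLAY_NAME and the class
EnvAttributeReader are invented for illustration. The lookup is passed
in as a function so the example runs without the servlet API on the
classpath.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper: reads a forwarded environment variable from a
// request-attribute lookup instead of from the request headers.
class EnvAttributeReader {

    // 'attributes' stands in for HttpServletRequest.getAttribute(String).
    // Returns null when the attribute was not forwarded.
    static String readAttribute(Function<String, Object> attributes, String name) {
        Object value = attributes.apply(name);
        return value == null ? null : value.toString();
    }

    public static void main(String[] args) {
        // A plain map plays the role of the request for this demonstration.
        Map<String, Object> fakeRequest = new HashMap<>();
        fakeRequest.put("AAI_DISPLAY_NAME", "example value");

        System.out.println(readAttribute(fakeRequest::get, "AAI_DISPLAY_NAME"));
        System.out.println(readAttribute(fakeRequest::get, "NOT_FORWARDED"));
    }
}
```

In a servlet you would pass request::getAttribute as the first argument.
Whether the value arrives with its characters intact is something you
would still have to test.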

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktd8hEACgkQ9CaO5/Lv0PBCdACfXGvpCFULt8Cs49xeQjdv+Rwz
2oAAmgNUr3WdHwRJ9T9x5XS+Jx3PkU7c
=tG4b
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org

