tomcat-users mailing list archives

From Christopher Schultz
Subject Re: [slightly OT] FORM based authentication and utf-8 encoding of credentials
Date Wed, 26 Jun 2013 13:58:00 GMT


On 6/26/13 8:01 AM, André Warnier wrote:
> Jan Vávra wrote:
>> Hello,
>>>>> When I create user with password with czech String
>>>>> "ŽežUlička.1" the browser sends correctly this string as:
>>>>> POST http://localhost:70/myapp/j_security_check HTTP/1.1 
>>>>> Content-Type: application/x-www-form-urlencoded
>>>>> j_username=p&j_password=%C5%BDe%C5%BEUli%C4%8Dka.1
>>> The browser is not sending that correctly. The password is
>>> UTF-8 encoded but the Content-Type fails to specify the
>>> character set used. If it did, Tomcat would treat the password
>>> as UTF-8.
>>> This is a common failing of browsers and is covered in the FAQ.
>>> [1]
>> Well I have tried IE, Firefox, Chrome. None of them is appending 
>> charset in Content-Type. I have manually modified the request
>> header to: Content-Type: application/x-www-form-urlencoded;
>> charset=utf-8 and Tomcat gives me the letters in the correct
>> form. Ok, good to know.
>>>>> Any idea how to tell tomcat to use utf-8 in form based
>>>>> authentication? It's tomcat 7.0.34 on Czech Windows 7 32
>>>>> bit with default ansi code page set as Windows-1250.
>>> Authentication is tricky because the processing happens before
>>> any user code runs. The best / only option is to set the
>>> characterEncoding attribute for the Authenticator [2] to UTF-8
>>> and hope that the browsers are consistent in their failing to
>>> follow the specification and use whatever encoding the page is
>>> encoded with.
>>> HTH,
>>> Mark
>>> [1] [2] 
>> As you suggested in [2], I have added to my app's context xml
>> <Valve
>> className="org.apache.catalina.authenticator.FormAuthenticator" 
>> characterEncoding="utf-8"/> and Czech letters are in the correct
>> form. This is a solution.
>> Thanks for the advice.
> By the way, regarding this basic failing of browsers: it is
> clearly contrary to the specs, yet for years all major browsers
> have consistently ignored the issue. This failure to add the
> character set/encoding to HTTP POSTs causes problems in
> multi-lingual web applications, and by itself forces multiple
> workarounds which are themselves perforce inconsistent. Does anyone
> have an idea why browsers keep ignoring this issue, version after
> version?
> (I would imagine that Apache httpd and Tomcat devs must have
> regular contacts with whoever develops browsers, so did anyone ever
> ask?)
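
To make the quoted problem concrete, here is a minimal Java sketch of what happens to the j_password value from the request above when the same bytes are decoded with two different character sets. It uses only java.net.URLDecoder; the class name FormDecodeDemo and the helper decode() are made up for illustration and are not Tomcat code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class, not part of Tomcat.
public class FormDecodeDemo {

    // Decodes one percent-encoded form value using the named charset.
    public static String decode(String raw, String charset)
            throws UnsupportedEncodingException {
        return URLDecoder.decode(raw, charset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // The j_password value exactly as it appears in the quoted POST body.
        String raw = "%C5%BDe%C5%BEUli%C4%8Dka.1";

        String asUtf8 = decode(raw, "UTF-8");        // what the browser meant
        String asLatin1 = decode(raw, "ISO-8859-1"); // each byte becomes one char

        System.out.println(asUtf8.length());          // prints 11
        System.out.println(asLatin1.length());        // prints 14
        System.out.println(asUtf8.equals(asLatin1));  // prints false
    }
}
```

The two results differ in length and content, so a password comparison against the stored value can only succeed if the server decodes with the encoding the browser actually used.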

Here's the long-standing bug in Mozilla:

...and one referenced from it with better spec-research:

The bottom line is two specific items:
1. Content-Type should only have a "charset" parameter for "text/" types
2. Adding "charset" to Content-Type for application/x-www-form-urlencoded
has broken some stuff in the past, so Mozilla has chosen not to
re-enable it
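
Since browsers will not send the parameter, the server side has to pick a fallback. A rough sketch of that decision in plain Java follows; the class ContentTypeCharset and the method charsetOf() are hypothetical helpers written for this message, not Tomcat's actual parsing code, and the fallback is whatever the caller passes in.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not Tomcat's implementation.
public class ContentTypeCharset {

    // Returns the charset named in a Content-Type header value,
    // or the supplied fallback when the parameter is absent or unusable.
    public static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq > 0 && param.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                String name = param.substring(eq + 1).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset fallback = StandardCharsets.ISO_8859_1; // assumed default for this demo
        System.out.println(charsetOf(
            "application/x-www-form-urlencoded; charset=utf-8", fallback)); // prints UTF-8
        System.out.println(charsetOf(
            "application/x-www-form-urlencoded", fallback));                // prints ISO-8859-1
    }
}
```

The second call is the case every browser in this thread produces, which is why the characterEncoding setting on the authenticator ends up being the practical fix.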

-chris

