tomcat-users mailing list archives

From Mark Thomas
Subject Re: Basic Authentication Failed with multibyte username
Date Thu, 21 Jan 2010 13:11:30 GMT
On 21/01/2010 06:55, André Warnier wrote:
> Mark Thomas wrote:
>> The authorisation header is base64
>> encoded so it is automatically compliant with RFC2616.
> Yes, it sounds like you're right; my mistake.
> (Also for Gabor, I admit my mistake.)
> I agree that the HTTP header itself is correct.
> But there is still something which puzzles me in general.
> Suppose that the browser and the server know nothing particular about
> one another, and that the server gets such an Authentication header from
> the browser.
> The Base64 decoding is done, and yields a series of bytes.
> Now this series of bytes has to be interpreted, to be translated into a
> string in Java (which is Unicode).  Which encoding should be chosen to
> decode the byte array ?
> If you use the default platform JVM encoding, you are making the
> assumption that the browser knew what this encoding is, aren't you ?
> On the other hand, the browser sent nothing to indicate in which
> encoding this string was, before it encoded it using Base64, or did it ?

RFC2617 to the rescue...

      basic-credentials = base64-user-pass
      base64-user-pass  = <base64 [4] encoding of user-pass,
                          except not limited to 76 char/line>
      user-pass         = userid ":" password
      userid            = *<TEXT excluding ":">
      password          = *TEXT

*TEXT is defined in RFC2616

       TEXT           = <any OCTET except CTLs,
                        but including LWS>

and finally

       OCTET          = <any 8-bit sequence of data>
       CTL            = <any US-ASCII control character
                        (octets 0 - 31) and DEL (127)>

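So actually, Tomcat is correct in the current treatment of credentials:
the decoded user-pass is TEXT, and RFC2616 only allows TEXT to contain
characters from character sets other than ISO-8859-1 when they are
encoded according to RFC2047. Therefore, not a bug.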
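
As a rough illustration of the grammar above (not Tomcat's actual code), the
header could be parsed along these lines. The class name BasicCredentials and
its parse method are hypothetical, and java.util.Base64 assumes Java 8 or
later. The octets are turned into a String with ISO-8859-1, and the split
happens at the first ":" because userid excludes ":" while password does not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, for illustration only; not part of Tomcat.
final class BasicCredentials {
    final String userid;
    final String password;

    private BasicCredentials(String userid, String password) {
        this.userid = userid;
        this.password = password;
    }

    // Parses an Authorization header value such as "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==".
    static BasicCredentials parse(String headerValue) {
        if (headerValue == null || !headerValue.regionMatches(true, 0, "Basic ", 0, 6)) {
            throw new IllegalArgumentException("not a Basic authorization header");
        }
        // base64-user-pass -> octets of user-pass
        byte[] octets = Base64.getDecoder().decode(headerValue.substring(6).trim());
        // ISO-8859-1 maps every octet to exactly one char, so nothing is lost here.
        String userPass = new String(octets, StandardCharsets.ISO_8859_1);
        // userid excludes ":", password may contain it: split at the first ":".
        int colon = userPass.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("user-pass has no ':' separator");
        }
        return new BasicCredentials(userPass.substring(0, colon),
                                    userPass.substring(colon + 1));
    }
}
```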

Also André's comments regarding ISO-8859-1 were right if considering the
actual user name and password rather than the header.

Supporting other encodings would be a useful enhancement but the default
will have to be ISO-8859-1 to remain spec compliant. What the browsers
will do for user names and passwords in other encodings is not defined
so it will be a case of YMMV.
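
To make the YMMV concrete, here is a small sketch of what can happen when a
client encodes the credentials in UTF-8 and the server decodes them as
ISO-8859-1. The class CredentialCharsetDemo and its methods are made up for
this example, and the user name is only a sample. Whether a given browser
really sends UTF-8 is an assumption, not something the RFCs define.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical demo class; names are illustrative only.
final class CredentialCharsetDemo {

    // What a client does: user-pass -> octets in some charset -> base64.
    static String encode(String userPass, Charset clientCharset) {
        return Base64.getEncoder().encodeToString(userPass.getBytes(clientCharset));
    }

    // What a server does: base64 -> octets -> String in the charset it assumes.
    static String decode(String base64UserPass, Charset serverCharset) {
        return new String(Base64.getDecoder().decode(base64UserPass), serverCharset);
    }

    public static void main(String[] args) {
        String original = "andr\u00e9:secret"; // "é" is U+00E9

        // Both sides use ISO-8859-1: the user name survives.
        System.out.println(decode(encode(original, StandardCharsets.ISO_8859_1),
                                  StandardCharsets.ISO_8859_1).equals(original));

        // Client uses UTF-8, server assumes ISO-8859-1: "é" arrives as two chars.
        System.out.println(decode(encode(original, StandardCharsets.UTF_8),
                                  StandardCharsets.ISO_8859_1).equals(original));
    }
}
```

In the mismatched case the two UTF-8 octets of "é" (0xC3 0xA9) are each read
as a separate ISO-8859-1 character, so the user name no longer matches what
is stored in the realm and authentication fails.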