tomcat-users mailing list archives

From André Warnier
Subject Re: Basic Authentication Failed with multibyte username
Date Fri, 22 Jan 2010 19:59:59 GMT
Christopher Schultz wrote:
> André,
> On 1/21/2010 6:35 PM, André Warnier wrote:
>> Basically, I would tend to say that if the server knows who the clients
>> are and vice-versa, you should be free to use any encoding you want,
>> with the limitation that what is exchanged on the wire conforms to HTTP
>> (because there may be proxies on the way which are not so tolerant).
> +1
>> What the client is sending is already (in a way) conformant to HTTP,
>> because it is base64 encoded and so, on the surface, it does not contain
>> non-ascii characters.
> +1
>> But the problem is that the standard Tomcat code which decodes the Basic
>> Authorization header does not work in the way you want, for these
>> illegal headers.
>> And this code should preferably not be changed in a way which breaks the
>> conformance with standard HTTP.
>> Because if you do that, then your Tomcat becomes useless for anything
>> else than your special client.
> +1
> Another possibility would be to use something like SecurityFilter, which
> allows you to (more easily) write your own authenticator and realm
> implementations, and you could write a BasicAuthenticator that reads
> these specially-formatted credentials.
> I checked the sf source, and it looks like we might have a bug:
>    private String decodeBasicAuthorizationString(String authorization) {
>       if (authorization == null || !authorization.toLowerCase().startsWith("basic ")) {
>          return null;
>       } else {
>          authorization = authorization.substring(6).trim();
>          // Decode and parse the authorization credentials
>          return new String(Base64.decodeBase64(authorization.getBytes()));
>       }
>    }
> That "authorization.getBytes()" is just asking for trouble, because it
> uses the platform default encoding to convert characters to bytes. It
> should be using US-ASCII, ISO-8859-1, or something like that.

I don't think you have a problem there, because the string you are 
converting into bytes there is base64 text, which only ever contains 
ASCII characters. Any ASCII-compatible platform encoding gives the same 
bytes.

> It also calls the String constructor with a byte array without
> specifying the encoding, therefore using the platform default.

That is indeed where you have a problem.  There you SHOULD always decode 
it as US-ASCII (or maybe iso-8859-1, I'm not quite sure what the spec 
says exactly).
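
To make the point concrete, here is a minimal sketch of the same method 
with both charsets named explicitly. It is not the SecurityFilter code: 
the class name BasicAuthDecoder is made up for illustration, and it 
assumes java.util.Base64 (Java 8 or later) instead of the commons-codec 
Base64 class used in the quoted snippet. ISO-8859-1 is picked for the 
final step only because it maps every byte to exactly one character; 
whether the spec really wants that or US-ASCII is the open question above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicAuthDecoder {

    /** Returns "userid:password", or null if this is not a Basic header. */
    static String decodeBasicAuthorizationString(String authorization) {
        // Case-insensitive check for the "Basic " scheme prefix.
        if (authorization == null
                || !authorization.regionMatches(true, 0, "Basic ", 0, 6)) {
            return null;
        }
        String token = authorization.substring(6).trim();

        // Base64 text is pure ASCII, so say so instead of relying on
        // the platform default encoding.
        byte[] tokenBytes = token.getBytes(StandardCharsets.US_ASCII);

        // Throws IllegalArgumentException if the token is not valid base64.
        byte[] decoded = Base64.getDecoder().decode(tokenBytes);

        // Assumption: interpret the credential bytes as ISO-8859-1,
        // independent of the platform default.
        return new String(decoded, StandardCharsets.ISO_8859_1);
    }
}
```

With this, a header like "Basic dXNlcjpwYXNz" decodes to "user:pass" on 
any platform. Note that java.util.Base64 is stricter than commons-codec 
about malformed input, so a caller would have to catch the exception.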

Let's say that the spec is clear and says that the header value is 
*TEXT, and that *TEXT is always US-ASCII (or ISO-8859-1) by default.

Let's take it from the browser side first.
If the "userid:password" is indeed composed only of us-ascii characters, 
then the browser base64-encodes this directly and it is trivial.(*)

But let's say that "userid:password" is something other than us-ascii.
Another part of the spec says that you then have to encode it according 
to RFC2047.
My contention is then that the browser should first RFC2047-encode 
"userid:password", and then base64-encode the result.
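
As a rough sketch of what such a client would do (no browser I know of 
does this, and the class and method names here are invented for the 
example), assuming Java 8 or later for java.util.Base64 and the "B" 
variant of RFC2047:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class Rfc2047BasicEncoder {

    /** Wraps text as an RFC 2047 "B" encoded-word: =?charset?B?base64?= */
    static String encodeWord(String text, Charset charset) {
        String payload =
                Base64.getEncoder().encodeToString(text.getBytes(charset));
        return "=?" + charset.name() + "?B?" + payload + "?=";
    }

    /** Builds the Authorization header value for the given credentials. */
    static String authorizationHeaderValue(String userid, String password,
                                           Charset charset) {
        String credentials = userid + ":" + password;

        // Pure ASCII credentials go through unchanged (the trivial case).
        boolean ascii =
                StandardCharsets.US_ASCII.newEncoder().canEncode(credentials);

        // Otherwise RFC 2047-encode first, which yields ASCII again.
        String wireText = ascii ? credentials : encodeWord(credentials, charset);

        // Second step: the usual base64 encoding of Basic authentication.
        return "Basic " + Base64.getEncoder()
                .encodeToString(wireText.getBytes(StandardCharsets.US_ASCII));
    }
}
```

The sketch ignores the RFC2047 limit of 75 characters per encoded-word, 
so long credentials would need to be split into several words.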

Back on the server side.
The server base64-decodes the authorization token, into an ascii string.
It can always do that, because either the string was ascii to start 
with, or else it was not, but then it has been RFC2047-encoded, yielding 
a result that is ascii.
(like : =?iso-8859-2?B?....base64-encoded stuff...?= )

Then the server must do another round of decoding via RFC2047.
That again consists of a double decoding: base64-decode the string 
between the ?? into bytes, and then decode those bytes into Unicode, 
using the charset indicated at the beginning of the rfc2047-encoded 
string.

The above, I believe, would be totally consistent with the current RFCs.
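
The server-side steps described above could be sketched roughly as 
follows. This is only an illustration under assumptions: the class name 
is invented, it uses java.util.Base64 (Java 8 or later), and it handles 
just a single "B" encoded-word, not the "Q" variant or several words in 
a row.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class Rfc2047BasicDecoder {

    // Matches one "B" encoded-word: =?charset?B?payload?=
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?]+)\\?[Bb]\\?([A-Za-z0-9+/=]*)\\?=");

    /** Takes the base64 token that follows "Basic " in the header. */
    static String decodeCredentials(String base64Token) {
        // First round: the normal base64 decoding, into an ASCII string.
        byte[] outer = Base64.getDecoder()
                .decode(base64Token.getBytes(StandardCharsets.US_ASCII));
        String text = new String(outer, StandardCharsets.US_ASCII);

        Matcher m = ENCODED_WORD.matcher(text);
        if (!m.matches()) {
            // Plain credentials, nothing more to decode.
            return text;
        }

        // Second round: base64-decode the payload, then turn the bytes
        // into characters using the charset named in the encoded-word.
        // Charset.forName throws if the charset name is unknown.
        Charset charset = Charset.forName(m.group(1));
        byte[] inner = Base64.getDecoder().decode(m.group(2));
        return new String(inner, charset);
    }
}
```

A client that sends raw non-ascii bytes without the RFC2047 step would 
not match the pattern, and its non-ascii bytes would come out of the 
first round as replacement characters.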

But there is a major catch : I don't believe that there is a browser on 
the market today, which "properly" encodes the "userid:password" string 
via rfc2047 when it isn't ascii.

And the OP's special client sends UTF-8, but also does not 
rfc2047-encode it.
