httpd-users mailing list archives

From André Warnier
Subject Re: [users@httpd] unicode in basic auth
Date Fri, 17 Oct 2008 20:28:06 GMT
Milos Jakubicek wrote:
> Unfortunately it's not so easy, let's see what I'm getting when I send 
> the following two strings "ěščřžýáíéňďťúůó" "ĚŠČŘŽÝÁÍÉŇĎŤÚŮÓ"
> username and password (delimited with "-"):

No, it's not easy, especially when there may be 2 or 3 translation steps 
going on in between that you don't know about.

But can you maybe try with something shorter? :-)
For example, try typing "aša" for the user-id, and no password.
That should be 3 bytes in iso-8859-2 (or the equivalent Windows 8-bit 
codepage); but in UTF-8, the "š" takes 2 bytes, so the total should be 
4 bytes.
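
A quick way to check those byte counts is a few lines of Python. This is 
only a sketch: the choice of Python is mine, and taking cp1250 as "the 
equivalent Windows 8-bit codepage" is an assumption.

```python
# Count the bytes "aša" occupies in a few candidate encodings.
# "\u0161" is "š"; the escape avoids depending on the editor's own encoding.
user_id = "a\u0161a"

# cp1250 is assumed here to be the Windows codepage in question.
codecs_to_try = ("iso-8859-2", "cp1250", "utf-8")

sizes = {}
for codec in codecs_to_try:
    raw = user_id.encode(codec)
    sizes[codec] = len(raw)
    # Show the length and the individual bytes in hex.
    print(f"{codec:>10}: {len(raw)} bytes  {raw.hex(' ')}")
```

The hex dump matters more than the length: it shows which byte value each 
encoding uses for the "š".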

But before you go any further:
I don't know if you will ever be able to find a solution to this issue 
using Basic Authentication.  Maybe there is simply no solution, for the 
reasons given below.
So, if you want to avoid spending a lot of time, maybe for nothing, you 
should seriously consider whether you can do authentication in another 
way, for example using an html login form. There are solutions for that 
which work with UTF-8, and a login form is no less secure than basic 
authentication (which is not secure at all).

Keep in mind that different browsers may also handle this differently, 
if it is not explicitly defined in the HTTP RFCs (2616, and 2617 for 
Basic authentication itself).
That may be another reason to switch to login forms.

That being said, if you want to persist:

If you are using Firefox as a browser, I strongly recommend that you get 
an add-on like LiveHttpHeaders.  That will show you *exactly* what your 
browser is sending to the server as HTTP headers with each request.

On the server, you should install a module that dumps the request 
headers to a logfile, as received.  That should exist in the standard 
Apache series. Unfortunately, when the server writes to its logfile, 
there might be some translation/encoding going on at that moment, which 
might or might not be dependent on the locale the server is running 
under.  That would confuse things even more.

So let's say that you type "aša" as the user-id, and click OK.
Your browser is going to take *whatever it understood*, format it into 
the right form for Basic authentication (user-id, a colon, password), 
encode that as a Base64 string, and send it as the content of the 
Authorization HTTP header to the server.  With LiveHttpHeaders, you 
will see what it sends (the Base64).

Now you need a program that decodes that Base64 back into a string of 
bytes, so that you can look at what it really is (3 bytes, 4 bytes, ...).
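
One possible such program, as a minimal Python sketch. The header value 
below is a made-up sample (it is what a UTF-8 browser would produce for 
"aša" with no password), so paste in the value you actually captured.

```python
import base64


def decode_basic(header_value: str) -> bytes:
    """Return the raw bytes carried by a 'Basic ...' Authorization value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authentication header value")
    return base64.b64decode(token.strip())


# Sample value, assumed for illustration only.
raw = decode_basic("Basic YcWhYTo=")

# Print the byte count and each byte in hex, without interpreting them.
print(len(raw), "bytes:", raw.hex(" "))
```

Note that the decoded bytes include the colon separator, so a 3-byte 
user-id with no password gives 4 bytes in total, and a 4-byte one gives 5.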

Anyway, the server will decode the Base64 string it gets in the header, 
pick out of it the user-id, and try the authentication with that.
I don't think that, at that point (after decoding the Base64 encoding), 
the server is going to decode the resulting string any further, from/to 
UTF-8 or whatever. It will just use the *bytes* it gets after decoding 
the Base64 string.
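
To illustrate why that matters, here is a small Python sketch of a purely 
byte-wise comparison. It is not Apache's actual code; the function name and 
the idea that the stored user-id was written in iso-8859-2 are assumptions 
made for the example.

```python
from typing import Tuple


def split_credentials(raw: bytes) -> Tuple[bytes, bytes]:
    """Split decoded Basic credentials at the first colon."""
    user, sep, password = raw.partition(b":")
    if not sep:
        raise ValueError("no colon found in credentials")
    return user, password


# Assumption: the user-id was stored on the server as iso-8859-2 bytes.
stored_user = "a\u0161a".encode("iso-8859-2")

# What arrives if the browser encoded the same user-id as UTF-8.
received_user, _ = split_credentials(b"a\xc5\xa1a:")

# The comparison works on bytes, not characters.
match = received_user == stored_user
print("user-id matches:", match)
```

The two byte strings represent the same characters, but a comparison that 
only looks at bytes has no way of knowing that.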

As Eric pointed out before, there are no mechanisms defined in HTTP to 
indicate an encoding *for the HTTP headers*. They are just supposed to 
be iso-8859-1 (and maybe even just US-ASCII). That's because HTTP was 
invented by a guy in Western Europe.  He was English, but luckily for us 
Western Europeans, he was working in Geneva at the time, so his pals 
probably told him that he needed to support at least German and French 
too. So iso-8859-1 is the default charset under HTTP (and HTML).  Still 
today.  UTF-8 is not the default, and it needs to be declared 
explicitly.  But there is no way to do that for HTTP headers. (The 
situation is just about as confused regarding URLs.)

As for your secondary question (cannot believe nobody had that problem 
before...): a lot of people still have problems with issues like that 
all the time.  There exist all kinds of tricks and recipes to avoid, or 
work around, similar issues with html forms, or with filenames.

Yours is the first time I have heard of it in relation to Basic 
Authentication and that browser built-in login dialog.

The official User-To-User support forum of the Apache HTTP Server Project.