httpd-dev mailing list archives

From "Martin Ramshaw" <mrams...@alumni.concordia.ca>
Subject Re: [1.3 PATCH/QUESTION] Win32 ap_os_is_filename_valid()
Date Thu, 14 Mar 2002 06:29:47 GMT
>Apache 1.3 on Win32 assumes that the names of files served are
>comprised solely of characters from character sets which are a superset
>of ASCII, such as UTF-8 or ISO-8859-1.

Umm, I assume that ASCII as you refer to it is its 7-bit incarnation.

That 7-bit set is also known as US-ASCII or ANSI X3.4-1968; the 8-bit
extension people usually mean is ISO-8859-1 (also called latin-1).

Note that nearly all character sets in common use are supersets of 7-bit
ASCII (EBCDIC being a notable exception), and most are supersets of 8-bit
ASCII (the exceptions being the various other 'latin' encodings - i.e.
ISO-8859-2 through ISO-8859-16 which differ in the various 'special'
characters).

This has the lovely side-effect that English is always an option,
regardless of the actual encoding being used.
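A quick way to see this is to compare the bytes produced by different
encodings. The sketch below is only an illustration: the file names and the
helper encodings_agree() are made up, and it checks just UTF-8 and
ISO-8859-1 rather than every character set.

```python
# Illustrative only: these file names are hypothetical.
ascii_name = "index.html"
accented_name = "caf\u00e9.html"  # contains U+00E9, outside 7-bit ASCII


def encodings_agree(name, encodings=("utf-8", "iso-8859-1")):
    """Return True if name encodes to identical bytes in every listed encoding."""
    encoded = {name.encode(enc) for enc in encodings}
    return len(encoded) == 1


print(encodings_agree(ascii_name))     # True: pure ASCII names are byte-identical
print(encodings_agree(accented_name))  # False: the accented letter is encoded differently
```

A name made only of 7-bit characters comes out the same either way; as soon
as one character falls outside that range, the byte sequences diverge.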

The difference is in the number and position of the bytes. Wide character
sets will pad to the left or right (depending on whether little-endian or
big-endian order is in effect) with a null byte. As well, various
byte-order markers may be present (fffe, feff, etc).
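To make the padding concrete, here is a small sketch using UTF-16 as the
wide encoding (my choice for the example; utf16_bytes() is a made-up helper
name). It shows where the null byte lands for an ASCII character in each
byte order, and the byte-order marks Python's codecs module defines for
UTF-16.

```python
import codecs


def utf16_bytes(text):
    """Return (little-endian, big-endian) UTF-16 encodings of text, without a BOM."""
    return text.encode("utf-16-le"), text.encode("utf-16-be")


le, be = utf16_bytes("A")
print(le.hex(), be.hex())  # 4100 0041: the null byte follows or precedes 0x41

# Byte-order marks as they appear at the start of a UTF-16 stream.
print(codecs.BOM_UTF16_LE.hex(), codecs.BOM_UTF16_BE.hex())  # fffe feff
```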

> It has no logic to determine whether or not a possible file name contains
> invalid characters.  It has no logic to properly match actual non-ASCII
> file names with names specified in the Apache configuration file.  Because
> Apache does not verify that the characters in file names are all from a
> valid character set, files containing various invalid characters in their
> names can be successfully served by Apache.

I think you've missed the boat on this one. Asian versions of Windows will
all probably use characters that you don't consider as ASCII (i.e. they will
be wide - actually Microsoft have done a pretty good job of this). I think
what you meant to say was that only English (and, come to think of it, only
English versions of Windows) is recommended - with wide character
support promised in Apache 2.0.

    Regards


