httpd-dev mailing list archives

From "William A. Rowe, Jr." <wr...@rowe-clan.net>
Subject RE: Filesystem utf-8 i18n?
Date Tue, 03 Oct 2000 00:47:44 GMT
> From: Fielding, Roy [mailto:fielding@eBuilt.com]
> Sent: Monday, October 02, 2000 7:24 PM
> 
> In general, it is a bad idea to allow too many URLs to map
> to the same resource within the server -- that is when security holes
> and excessive spidering become a problem.  As long as you can pick
> a canonical form and externally redirect non-canonical versions 
> of the URL to the canonical form in a computationally reasonable 
> fashion, I'm happy.

I'm envisioning three canonical levels:

*) Pure; that is strcmp(fspec|path, url) - this makes Apache/Win32
   quite case sensitive, perhaps frustrating, but the most secure.
   No equivalence for short names whatsoever.

*) Restrictive; same as Semipure below, but followed by an external
   redirect to the Pure result if a Pure test fails.

*) Semipure; this is what we are doing today: accept any path spec,
   but convert to a pure form (case insensitive and all) for any
   comparison and permission testing.  Whatever NT accepts as
   legit is legit, but the proper name retranslated back from
   unicode to utf-8 will be tested.  Shortnames are permitted.
   The user is not given any feedback to the true name of the
   file, and no, I don't like this any more than you do.
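
As a rough sketch of how the three levels would differ for one request, here is a toy model in Python. Everything in it is hypothetical: `Level`, `lookup`, and `resolve` are illustration names, not Apache APIs, and the in-memory `files` table only stands in for what NT would report as the true name. Treating a Pure mismatch as "not found" is also an assumption on my part.

```python
from enum import Enum


class Level(Enum):
    PURE = "pure"
    RESTRICTIVE = "restrictive"
    SEMIPURE = "semipure"


def lookup(url_path, files):
    """Toy stand-in for the filesystem: map a requested path to its true name.

    `files` maps each true (long) name to its short name. Matching is
    case-insensitive, loosely mimicking what NT would accept.
    """
    wanted = url_path.lower()
    for true_name, short_name in files.items():
        if wanted in (true_name.lower(), short_name.lower()):
            return true_name
    return None


def resolve(url_path, true_name, level):
    """Return (action, path) for a request under the given canonical level."""
    if true_name is None:
        return ("not_found", None)
    if url_path == true_name:
        # Exact match, the analogue of strcmp() == 0: fine at every level.
        return ("serve", true_name)
    if level is Level.PURE:
        # Assumption: a non-exact name is simply not recognized.
        return ("not_found", None)
    if level is Level.RESTRICTIVE:
        # Accept the alias, but send the client to the canonical name.
        return ("redirect", true_name)
    # Semipure: serve silently; permission checks use true_name.
    return ("serve", true_name)


files = {"/docs/LongFileName.html": "/docs/LONGFI~1.HTM"}
request = "/docs/longfi~1.htm"
for level in Level:
    print(level.value, resolve(request, lookup(request, files), level))
```

Only Restrictive tells the client what the true name is; Semipure serves the file without any feedback, and Pure refuses the alias outright.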

All but Pure would allow a two-character umlaut-U combination, while
Pure will insist on the precise encoding stored on the file system.
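
To make the umlaut case concrete, here is a small Python sketch. It assumes the "two-character combo" means U followed by a combining diaeresis (U+0308), as opposed to the single precomposed code point U+00DC. Using NFC normalization as the loose comparison is my assumption for illustration only; it is not a claim about what NT or Apache actually does.

```python
import unicodedata

precomposed = "\u00dc"   # LATIN CAPITAL LETTER U WITH DIAERESIS, one code point
combining = "U\u0308"    # "U" followed by COMBINING DIAERESIS, two code points


def equal_pure(a, b):
    """Byte-for-byte comparison of the UTF-8 encodings (the Pure test)."""
    return a.encode("utf-8") == b.encode("utf-8")


def equal_normalized(a, b):
    """Compare after NFC normalization (an assumed stand-in for a looser test)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(precomposed.encode("utf-8"))   # two bytes: 0xC3 0x9C
print(combining.encode("utf-8"))     # three bytes: 'U' 0xCC 0x88
print(equal_pure(precomposed, combining))
print(equal_normalized(precomposed, combining))
```

The two spellings render the same but differ at the byte level, so a strict strcmp-style test rejects one of them, while a normalizing comparison treats them as the same name.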

This will be a server-wide config directive, when I'm done (ok...
perhaps by directory).  That's why you haven't seen a forward port 
of any of the canonical stuff from 1.3.x yet.

