httpd-dev mailing list archives

From "Ralf S. Engelschall" <...@en.muc.de>
Subject Re: mod_rewrite v2.2-SNAP
Date Wed, 07 Aug 1996 10:38:40 GMT
On 7 Aug 1996 03:09:51 +0200 in en.lists.apache-new-httpd you wrote:
> On Tue, 6 Aug 1996, Ralf S. Engelschall wrote:

> [...]
> Anyhow, I took a look at v2.2-SNAP of mod_rewrite. I had to make a slight
> change to mod_rewrite.h to get it to compile:
> [...forgotten regex switch...]

Fixed. Thanks. I just tested the latest v2.2-SNAP with 1.1 and here it was
ok. Hmmm..

> [...]
> A qualifier, though: I find the behavior of RewriteRule in .htaccess
> files, while somewhat consistent, rather confusing, and just plain Wrong
> in cases. Here's what it does, as near as I can tell (this is based on
> behavior, not on the code):

> It takes the filename (not the URI, as RewriteRules in server config
> files do), and strips off part of it. If the document is within the
> DocumentRoot, it strips off just that, else it strips off everything. In
> other words, if /www/htdocs is the DocumentRoot, and you have an .htaccess
> file in /www/htdocs/abc, it strips off "/www/htdocs/" from the front. If
> you have a filename in /abc/def, it strips off "/abc/def/" from the front.

Correct. I reviewed my code and I don't know why I did the extra stuff for
document_root, but there were some problems in the past without it, so I
added it. Hmmm.. I just removed it and it works fine, and better, i.e. the
problem you described is fixed.

> If it then matches, it takes the substituted string, and adds back on what
> it took off. It then tries to issue an internal redirect to that string.
> In other words, if in /abc/def/.htaccess, I have "RewriteRule ^one two",
> it will issue an internal redirect to the URL /abc/def/two. This is
> entirely wrong. 100%. Let's say I had "Alias /xyz /abc/def" in my config
> files. There is no url /abc/def/two. It should issue an internal redirect
> to /xyz/two.

Hmmm... correct! This never occurred on my test machines, because there the
URLs and filenames are very closely related. But the problem you describe is
a really horrible one, because there are no trivial solutions...
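
To make the failure concrete, here is a minimal Python sketch of the behaviour
described above. It is only a simplified model, not the mod_rewrite code: the
function name per_dir_rewrite is hypothetical, and it assumes a matching rule
replaces the whole local path with the substitution.

```python
import re

def per_dir_rewrite(filename, per_dir_prefix, pattern, substitution):
    """Model of the described behaviour: strip the per-dir prefix,
    apply one rule, then add the prefix back on (hypothetical helper)."""
    if not filename.startswith(per_dir_prefix):
        return None
    local = filename[len(per_dir_prefix):]      # e.g. "one"
    if re.search(pattern, local) is None:
        return None                             # rule does not apply
    # Assumption: a matching rule replaces the whole local path.
    return per_dir_prefix + substitution

# Config from the example: Alias /xyz /abc/def
# /abc/def/.htaccess:      RewriteRule ^one two
target = per_dir_rewrite("/abc/def/one", "/abc/def/", r"^one", "two")
print(target)
```

The value handed to the internal redirect is a filesystem path. With the Alias
in place the URL that would actually work is /xyz/two, which this procedure
has no way of producing.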

> The proper behavior here, as I've said before, is to never look at what
> r->filename is. Always look at r->uri. And don't do any stripping. People
> should know what their URLs are. So I can put "RewriteRule ^/abc/one
> /abc/two", and all will work as expected. 

I don't want to make per-dir RewriteRules work this way, because it prevents
some kinds of URL rewriting from being used. I cannot explain it briefly, but
have a look at the example of the killer application net.sw in the
mod_rewrite documentation. If you always have to match your per-dir URL
prefix, it is much harder to get the RewriteRules working. I first wrote the
per-dir stuff without stripping, then tried to build an application like
net.sw and discovered that it is horrible. Then I decided that it is better
to be able to think locally in per-dir RewriteRules, and so I wrote the
stripping stuff. Hmmm.. and now applications like net.sw can be set up with
much less convoluted rules.

BTW: There are a lot of people out there who downloaded mod_rewrite v2.0 and
emailed me that they are using the per-dir stuff. If I changed it back to
no-stripping, this would make a lot of people cry. So it is better to search
for an internal solution and let people think locally in per-dir context...

>                                           Then all you need to do is say
> that RewriteRules in .htaccess files resolve to URLs instead of
> filenames, and everyone is happy. 

At first glance, yes. But this is only practical if you have one-shot rules
(with [L] at the end). If you want to apply more than one rule in sequence,
you are back in the situation from above, where you are forced to think
globally.
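
A small Python sketch of the difference between thinking locally and thinking
globally when rules are chained. The rule sets and the helper apply_rules are
hypothetical illustrations, not taken from net.sw or from the mod_rewrite
source, and the model again assumes a matching rule replaces the whole path.

```python
import re

def apply_rules(path, rules):
    """Apply (pattern, substitution) pairs in sequence (hypothetical helper).
    A matching rule replaces the whole path; $N refers to capture group N."""
    for pattern, substitution in rules:
        m = re.search(pattern, path)
        if m:
            path = re.sub(r"\$(\d)",
                          lambda g, m=m: m.group(int(g.group(1))) or "",
                          substitution)
    return path

prefix = "/abc/def/"

# Local thinking: the per-dir prefix is stripped before matching.
local_rules = [
    (r"^old/(.*)", "new/$1"),
    (r"^new/(.*)\.htm$", "new/$1.html"),
]

# Global thinking: every pattern and substitution repeats the prefix.
global_rules = [
    (r"^/abc/def/old/(.*)", "/abc/def/new/$1"),
    (r"^/abc/def/new/(.*)\.htm$", "/abc/def/new/$1.html"),
]

local_result = prefix + apply_rules("old/page.htm", local_rules)
global_result = apply_rules("/abc/def/old/page.htm", global_rules)
print(local_result == global_result)
```

Both rule sets give the same result, but the global one has to carry the
prefix through every step, and it stops working as soon as the directory is
moved.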

> Or, even better, you could in fact make
> the rewritten target a filename. Just run sub_req_lookup_file on the name,
> and if it passes, pull out all the pertinent info (filename, args, finfo,
> content_*, handler), stick it into the request_rec, and return OK. That
> might work. Although it could also cause problems. 

Yes, I see the problems arising, because per-dir walking happens very late in
the Apache request process. Hmmmm... no, we shouldn't do such hacks.

> It might be better to
> stick with the internal redirect method.

Yes, as we discussed a few weeks ago, this is really API-compliant and the
best solution for per-dir rewrites.

> Regardless, I'm quite sure the current behavior is wrong.

Yes, to sum it up, there is one more problem with per-dir rewrites:

    ** The result of a per-dir rewrite is a valid filename and not a URL.
    ** But the internal redirect needs a valid URL. Period!

Correct? Ok, now we should think about how we can make the transition back
from the filename to the URL before the internal redirect is called.

I think there is no programmatic solution, because in your example above,
just before the internal redirect, mod_rewrite has e.g.:

    r->uri       == "/xyz/one.cgi/some/path/info/stuff?any_query_info"

    r->filename  == "/abc/def/two.cgi"
    r->path_info == "/some/path/info/stuff"
    r->args      == "any_query_info"

And from these there is no way to generate the URL

    /xyz/two.cgi/some/path/info/stuff?any_query_info

Is there?

There is only one correct solution: a RewriteBase directive for .htaccess
files, which sets the Base-URL of this .htaccess file. In the above example
the /abc/def/.htaccess file would read:

    RewriteEngine On
    RewriteBase   /xyz
    RewriteRule   ^one\.cgi(.*)  two.cgi$1

If this directive is missing, the per-dir prefix is used, which can then be
wrong: if the per-dir prefix is not a valid URL prefix, the user has to use
RewriteBase. Period!
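
As a rough Python sketch of how such a RewriteBase could be applied (the
function names and the exact order of steps are assumptions for illustration,
not the actual implementation): strip the per-dir prefix, apply the rule
locally, then prepend RewriteBase if it is set, otherwise the per-dir prefix.
The sketch assumes the path info is part of what the pattern sees, so that
(.*) can carry it over.

```python
import re

def expand(substitution, match):
    """Replace $N in the substitution with capture group N (hypothetical helper)."""
    return re.sub(r"\$(\d)",
                  lambda g: match.group(int(g.group(1))) or "",
                  substitution)

def per_dir_redirect_url(filename, path_info, args, per_dir_prefix,
                         pattern, substitution, rewrite_base=None):
    """Build the URL for the internal redirect (simplified model)."""
    # Assumption: the local path is the filename without the per-dir
    # prefix, followed by the path info.
    local = filename[len(per_dir_prefix):] + path_info
    m = re.search(pattern, local)
    if m is None:
        return None
    # Fall back to the per-dir prefix when no RewriteBase is given.
    base = rewrite_base if rewrite_base is not None else per_dir_prefix
    url = base.rstrip("/") + "/" + expand(substitution, m)
    return url + ("?" + args if args else "")

request = dict(filename="/abc/def/one.cgi",
               path_info="/some/path/info/stuff",
               args="any_query_info",
               per_dir_prefix="/abc/def/",
               pattern=r"^one\.cgi(.*)",
               substitution="two.cgi$1")

with_base = per_dir_redirect_url(rewrite_base="/xyz", **request)
without_base = per_dir_redirect_url(**request)
print(with_base)
print(without_base)
```

With RewriteBase /xyz the result is the URL wanted above. Without it the
per-dir prefix is used, which is only right when the directory path happens to
be a valid URL prefix as well.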

Is this ok?
                                        Ralf S. Engelschall    
                                        rse@engelschall.com
                                        http://www.engelschall.com/~rse
