httpd-dev mailing list archives

From Günther Gsenger <guenther.gsen...@gmail.com>
Subject [PATCH] mod_rewrite: re-encoding URLs for use as query-string arguments
Date Fri, 25 May 2007 10:01:04 GMT
There are at least two bug reports about mod_rewrite unescaping URLs and
then passing the unescaped backreferences into the rewrite target (bugs
34602 and 32328 deal with this in httpd-2, and I recall another bug report
for 1.3).
As far as I can tell, the real problem is that httpd unescapes the URL
before passing it to the mapper modules.
E.g. the rule
RewriteRule ^/(.*)$ /index.php?title=$1 [L]
rewrites URLs like /Foo to /index.php?title=Foo, but as soon as the URL
contains an escaped special character, such as an escaped hash, plus or
ampersand, the result is not as intended. /Foo%2BBar (/Foo+Bar urlencoded)
is rewritten to /index.php?title=Foo+Bar, which the script reads as
"Foo Bar", instead of /index.php?title=Foo%2BBar. Even worse, /Foo%23Bar
(/Foo#Bar urlencoded) is rewritten to /index.php?title=Foo#Bar instead of
/index.php?title=Foo%23Bar, so that everything after the hash is ignored.
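
The effect can be reproduced outside Apache. The following is only a rough
Python sketch, not mod_rewrite's code: naive_rewrite and
title_seen_by_script are hypothetical helpers. The first mimics applying the
rule above to an already-unescaped path, and the second uses urllib.parse as
a stand-in for the query-string parsing a script such as index.php would do.

```python
import re
from urllib.parse import unquote, urlsplit, parse_qs


def naive_rewrite(raw_path: str) -> str:
    """Mimic `RewriteRule ^/(.*)$ /index.php?title=$1` on an unescaped path."""
    # Assumption: the path is unescaped before the rule sees it.
    path = unquote(raw_path)
    match = re.match(r"^/(.*)$", path)
    if match is None:
        raise ValueError("path does not start with '/': %r" % raw_path)
    # The backreference is substituted verbatim, without re-encoding.
    return "/index.php?title=" + match.group(1)


def title_seen_by_script(target: str) -> str:
    """Return the value of `title` as a query-string parser would decode it."""
    query = urlsplit(target).query
    return parse_qs(query, keep_blank_values=True)["title"][0]


for raw in ("/Foo", "/Foo%2BBar", "/Foo%23Bar", "/Foo%26Bar"):
    target = naive_rewrite(raw)
    print(raw, "->", target, "->", repr(title_seen_by_script(target)))
```

In this sketch the plus is decoded as a space, and the hash and ampersand
cut the title short, which matches the behaviour described above.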

I know that there are workarounds for this problem, such as using the
untouched %{THE_REQUEST} variable in a rewrite condition or inspecting the
original request in the script that gets executed (as Wikimedia does), but
these are suboptimal.
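
The %{THE_REQUEST} workaround relies on the request line still carrying the
path exactly as the client sent it. A rough Python sketch of that idea,
assuming a well-formed request line; raw_path_from_request_line and
rewrite_from_request_line are hypothetical names used only for illustration:

```python
import re


def raw_path_from_request_line(request_line: str) -> str:
    """Return the still-escaped path from a line like 'GET /Foo%2BBar HTTP/1.1'."""
    # Capture everything after the method up to the first space or '?'.
    match = re.match(r"^[A-Z]+ (/[^ ?]*)", request_line)
    if match is None:
        raise ValueError("unexpected request line: %r" % request_line)
    return match.group(1)


def rewrite_from_request_line(request_line: str) -> str:
    """Build the rewrite target from the escaped path, leaving it untouched."""
    # Drop the leading slash and reuse the escaped text verbatim as the title.
    return "/index.php?title=" + raw_path_from_request_line(request_line)[1:]


print(rewrite_from_request_line("GET /Foo%2BBar HTTP/1.1"))
print(rewrite_from_request_line("GET /Foo%23Bar HTTP/1.1"))
```

In Apache itself the same idea means matching %{THE_REQUEST} in a
RewriteCond and using its %N backreference in the substitution.
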
I have written a patch that tries to address this problem. To remain
backwards-compatible, I did not change the original rewrite behaviour but
instead added a new flag to indicate that backreferences should be
escaped.
Adding the flag [B] or [backrefescaping] to a RewriteRule makes  
mod_rewrite escape the backreferences in the rewrite target, e.g.
RewriteRule ^/(.*)$ /index.php?title=$1 [L,B]
forces the backreferenced parts to be re-encoded when the rewrite target
is constructed.
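
To illustrate the intended effect of [B], here is a minimal Python sketch
that re-encodes the backreference before substituting it. It is not the
patch's C code: rewrite_with_backref_escaping is a hypothetical name, and
urllib.parse.quote with safe="" is only assumed as a stand-in for the
escaping routine in the patch, whose exact set of escaped characters may
differ.

```python
import re
from urllib.parse import quote, unquote, urlsplit, parse_qs


def rewrite_with_backref_escaping(raw_path: str) -> str:
    """Mimic `RewriteRule ^/(.*)$ /index.php?title=$1 [L,B]`."""
    # Assumption: the path is unescaped before the rule sees it.
    path = unquote(raw_path)
    match = re.match(r"^/(.*)$", path)
    if match is None:
        raise ValueError("path does not start with '/': %r" % raw_path)
    # Re-encode the backreference so it is safe as a query-string value.
    escaped = quote(match.group(1), safe="")
    return "/index.php?title=" + escaped


def title_seen_by_script(target: str) -> str:
    """Return the value of `title` as a query-string parser would decode it."""
    query = urlsplit(target).query
    return parse_qs(query, keep_blank_values=True)["title"][0]


for raw in ("/Foo", "/Foo%2BBar", "/Foo%23Bar"):
    target = rewrite_with_backref_escaping(raw)
    print(raw, "->", target, "->", repr(title_seen_by_script(target)))
```

With the re-encoding step, the script in this sketch sees the title exactly
as it appeared in the original URL, including the plus and the hash.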

The patch can be found here:  
http://issues.apache.org/bugzilla/attachment.cgi?id=20217
Note that it is against 2.2.4 because I couldn't get the SVN version to  
work.

Here is the patch for the doc (against SVN HEAD):
--- httpd/docs/manual/mod/mod_rewrite.xml.orig	2007-05-18 19:28:17.796875000 +0200
+++ httpd/docs/manual/mod/mod_rewrite.xml	2007-05-18 19:18:25.078125000 +0200
@@ -1176,6 +1176,19 @@
        following flags: </p>

        <ul>
+		<li>'<strong><code>backrefescaping|B</code></strong>'
+		Escapes the backreferences in the substitution string for
+		use as query string arguments.
+<example>
+RewriteRule ^(.*)$	index.php?show=$1	[B,L]
+</example>
+		If you do not use this flag, the URL will already have been unescaped
+		when the backreference is substituted. This will not work if the initial
+		URL contains any special characters that need escaping.
+		In the given example, loading the URL http://example.com/C++ would
+		cause an internal redirect to index.php?show=C%2B%2B instead of
+		index.php?show=C++ (which would probably not give the intended result).
+		</li>
          <li>'<strong><code>chain|C</code></strong>'
          (<strong>c</strong>hained with next rule)<br />
           This flag chains the current rule with the next rule

-- 
Günther
