httpd-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Plüm, Rüdiger, VF-Group <ruediger.pl...@vodafone.com>
Subject Re: PR#41798 - mod_proxy URL mangling
Date Thu, 25 Oct 2007 10:02:21 GMT


> -----Ursprüngliche Nachricht-----
> Von: Nick Kew 
> Gesendet: Mittwoch, 24. Oktober 2007 03:07
> An: dev@httpd.apache.org
> Betreff: PR#41798 - mod_proxy URL mangling
> 
> 
> Some time ago, I posted a draft fix for PR#41798:
> http://www.mail-archive.com/dev@httpd.apache.org/msg37836.html
> 
> It attracted some comments here, and needed further work.
> 
> I've written and test-driven a slightly more sophisticated patch:
> 
> * ProxyPass directive accepts an optional "nocanon" keyword,
>   that tells us not to canonicalise the URL.  Without "nocanon",
>   behaviour is unchanged.
> 
> * proxy_trans checks nocanon.  If set, it constructs r->filename
>   from r->unparsed_uri.  To deal with Rudiger's objections to
>   my previous patch, it does so only after matching the ProxyPass
>   to r->uri.  If there's a mismatch between the two, indicating
>   path cleaning, it issues a 301 redirect, as indicated by Roy
>   for when we change a URL.
> 
> * If nocanon is set, then HTTP canonicalisation skips
>   ap_proxy_canonenc.  This is in line with the forward-proxy
>   fix to PR#42592.
> 
> New patch attached, soliciting review.


Index: modules/proxy/mod_proxy.c
===================================================================
--- modules/proxy/mod_proxy.c	(revision 587155)
+++ modules/proxy/mod_proxy.c	(working copy)
@@ -537,27 +540,36 @@
                 if ((real[0] == '!') && (real[1] == '\0')) {
                     return DECLINED;
                 }
-                found = ap_pregsub(r->pool, real, r->uri, AP_MAX_REG_MATCH,
-                                   regm);
-                /* Note: The strcmp() below catches cases where there
-                 * was no regex substitution. This is so cases like:
-                 *
-                 *    ProxyPassMatch \.gif balancer://foo
-                 *
-                 * will work "as expected". The upshot is that the 2
-                 * directives below act the exact same way (ie: $1 is implied):
-                 *
-                 *    ProxyPassMatch ^(/.*\.gif)$ balancer://foo
-                 *    ProxyPassMatch ^(/.*\.gif)$ balancer://foo$1
-                 *
-                 * which may be confusing.
-                 */
-                if (found && strcmp(found, real)) {
-                    found = apr_pstrcat(r->pool, "proxy:", found, NULL);
+                /* test that we haven't reduced the URI */
+                if (ent[i].nocanon
+                    && ap_regexec(ent[i].regex, r->unparsed_uri,
+                                  AP_MAX_REG_MATCH, reg1, 0)) {
+                    mismatch = 1;
                 }
                 else {
-                    found = apr_pstrcat(r->pool, "proxy:", real, r->uri,
-                                        NULL);
+                    found = ap_pregsub(r->pool, real, use_uri,
+                                       AP_MAX_REG_MATCH,
+                                       use_uri == r->uri ? regm : reg1);

Why not  ent[i].nocanon ? reg1 : regm  ??

+                    /* Note: The strcmp() below catches cases where there
+                     * was no regex substitution. This is so cases like:
+                     *
+                     *    ProxyPassMatch \.gif balancer://foo
+                     *
+                     * will work "as expected". The upshot is that the 2
+                     * directives below act the exact same way (ie: $1 is implied):
+                     *
+                     *    ProxyPassMatch ^(/.*\.gif)$ balancer://foo
+                     *    ProxyPassMatch ^(/.*\.gif)$ balancer://foo$1
+                     *
+                     * which may be confusing.
+                     */
+                    if (found && strcmp(found, real)) {
+                        found = apr_pstrcat(r->pool, "proxy:", found, NULL);
+                    }
+                    else {
+                        found = apr_pstrcat(r->pool, "proxy:", real,
+                                            use_uri, NULL);
+                    }
                 }
             }
         }
@@ -568,16 +580,37 @@
                 if ((real[0] == '!') && (real[1] == '\0')) {
                     return DECLINED;
                 }
-
-                found = apr_pstrcat(r->pool, "proxy:", real,
-                                    r->uri + len, NULL);
-
+                if (ent[i].nocanon
+                    && len != alias_match(r->unparsed_uri, ent[i].fake)) {
+                    mismatch = 1;
+                }
+                else {
+                    found = apr_pstrcat(r->pool, "proxy:", real,
+                                        use_uri + len, NULL);
+                }
             }
         }
+        if (mismatch) {
+            /* We made a reducing transformation, so we can't safely use
+             * unparsed_uri.  Send a redirect to the canonicalised URI
+             * instead.  FIXME: this breaks if protocol isn't http.
+             */
+            use_uri = apr_pstrcat(r->pool, "http://",
+                          r->hostname?r->hostname:r->server->server_hostname,
+                          ":", r->parsed_uri.port?r->parsed_uri.port_str:"80",
+                          r->uri, NULL);

I guess 

use_uri = ap_construct_url(r->pool, r->uri, r);

could help to address your FIXME.

Index: modules/proxy/mod_proxy.h
===================================================================
--- modules/proxy/mod_proxy.h	(revision 587155)
+++ modules/proxy/mod_proxy.h	(working copy)
@@ -113,6 +113,7 @@
     const char  *real;
     const char  *fake;
     ap_regex_t  *regex;
+    int nocanon;
 };
 
 struct dirconn_entry {

I am missing a minor bump since mod_proxy.h is public.

Otherwise looks good from first glance.

Regards

Rüdiger



Mime
View raw message