httpd-dev mailing list archives

From Nick Kew <>
Subject Re: Broken URI-unescaping in mod_proxy
Date Sun, 07 Oct 2007 22:50:59 GMT
On Thu, 13 Sep 2007 08:47:13 -0700
"Roy T. Fielding" <> wrote:

>    Proxies are absolutely
> forbidden from making any change to the URI -- they must forward
> as is or return an error.

This is at the root of PR41798, and the others I've marked as
duplicates of it.

In fact, it seems to be simpler to fix than I realised.
Despite standard URL manipulation, the URL is correct at the
point where it's passed to proxy_http_canon (+clones like
proxy_balancer_canon).  It is specifically ap_proxy_canonenc
that corrupts URLs containing escaped characters.

The bug is fixed if we just remove ap_proxy_canonenc!

Looking more closely at ap_proxy_canonenc, it is indeed
just plain wrong at this point:

 * Convert a URL-encoded string to canonical form.
 * It decodes characters which need not be encoded,
 * and encodes those which must be encoded, and does not touch
 * those which must not be touched.

The first clause (decodes characters which need not be encoded)
is the culprit here, directly responsible for the bug.
Re-encoding characters that must be encoded is AFAICT superfluous:
if the URL contains disallowed bytes at this point due to a
bug in our earlier processing, we should reject it with 400
rather than change it.

Given my history of fluffing up late night patches, I'll leave
this for now.  But if no one shouts, I'll replace ap_proxy_canonenc
with a simple validity check in the morning.
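
For illustration only, a minimal sketch of what such a validity check
could look like.  This is not the actual patch: the name
proxy_url_is_valid and the choice of which bytes count as disallowed
are assumptions, and a real check would follow the URI grammar more
closely.  The point is that the string is only inspected, never
rewritten.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of the httpd API.  Returns 1 if every
 * byte of url is either part of a well-formed %XX escape or a printable,
 * non-space ASCII character; returns 0 otherwise.  The string is never
 * modified.  The accepted byte range is an assumption for this sketch. */
static int proxy_url_is_valid(const char *url)
{
    size_t i;

    for (i = 0; url[i] != '\0'; i++) {
        unsigned char c = (unsigned char)url[i];

        if (c == '%') {
            /* An escape must be followed by exactly two hex digits.
             * Short-circuit evaluation stops at a terminating NUL. */
            if (!isxdigit((unsigned char)url[i + 1])
                || !isxdigit((unsigned char)url[i + 2])) {
                return 0;
            }
            i += 2;
        }
        else if (c <= 0x20 || c >= 0x7f) {
            /* Control characters, space, DEL and non-ASCII bytes should
             * have arrived escaped; the caller would answer 400. */
            return 0;
        }
    }
    return 1;
}
```

A caller would forward the URL untouched when this returns 1 and
return 400 when it returns 0.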

GET http://redirector/redirect-to/
Proxy correctly preserves %2F, but rewrites %3A to a colon.
Likewise it'll incorrectly unescape ? and &.  This breaks
things like yahoo redirector.
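
To make the failure mode concrete, here is a small model of a
"canonicalising" pass that decodes escapes it considers unnecessary.
It is a sketch, not the ap_proxy_canonenc source: decode_allowed, its
allowed-set parameter and the sample path are all made up for the
example.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model, not httpd code.  Decodes %XX when the decoded
 * byte is alphanumeric or listed in `allowed`, and copies every other
 * escape through unchanged.  out must hold at least strlen(in) + 1
 * bytes, since the result is never longer than the input. */
static void decode_allowed(const char *in, const char *allowed, char *out)
{
    while (*in != '\0') {
        if (in[0] == '%' && isxdigit((unsigned char)in[1])
            && isxdigit((unsigned char)in[2])) {
            char hex[3] = { in[1], in[2], '\0' };
            int ch = (int)strtol(hex, NULL, 16);

            if (ch != 0 && (isalnum(ch) || strchr(allowed, ch) != NULL)) {
                *out++ = (char)ch;      /* the escape is lost here */
            }
            else {
                memcpy(out, in, 3);     /* escape kept verbatim */
                out += 3;
            }
            in += 3;
        }
        else {
            *out++ = *in++;             /* ordinary byte, copied as is */
        }
    }
    *out = '\0';
}
```

With an allowed set of ":?&", the made-up path
/redirect-to/http%3A%2F%2Fhost%2F comes out as
/redirect-to/http:%2F%2Fhost%2F, so the origin server sees a different
URI from the one the client sent.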

Nick Kew

Application Development with Apache - the Apache Modules Book
