From: Joshua Marantz
Date: Mon, 3 Jan 2011 16:07:19 -0500
Subject: Re: Overriding mod_rewrite from another module
To: modules-dev@httpd.apache.org

I answered my own question by implementing it and failing. You can't
bypass mod_authz_host because it gets invoked via the magic macro:

    AP_IMPLEMENT_HOOK_RUN_ALL(int,access_checker,
                              (request_rec *r), (r), OK, DECLINED)

This means that returning OK from my handler does not prevent
mod_authz_host's handler from being called.

I came up with a simpler idea that does not require depending on
string-literals in mod_rewrite.c. I still add a translate_name hook to
run prior to mod_rewrite, but I don't try to prevent mod_rewrite from
corrupting my URL. Instead I just squirrel away the uncorrupted URL in my
own entry in request->notes so that I can use that rather than
request->unparsed_uri downstream when processing the request.
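
To make the notes idea concrete, here is a minimal, self-contained sketch
in plain C. It is a model, not module code: fake_request, note_set,
note_get, the note key and the rewritten path are all made-up stand-ins.
In a real module the note would presumably be stored with
apr_table_setn(r->notes, ...) from a translate_name hook registered to run
ahead of mod_rewrite.c, and read back with apr_table_get().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for request_rec and r->notes (illustration only). */
#define MAX_NOTES 8
#define URI_LEN 256
#define ORIGINAL_URL_NOTE "example-original-url" /* made-up note key */

typedef struct {
    const char *key;
    char value[URI_LEN];
} note_t;

typedef struct {
    char unparsed_uri[URI_LEN];
    note_t notes[MAX_NOTES];
    int note_count;
} fake_request;

/* Set up a request with the URL exactly as the client sent it. */
void init_request(fake_request *r, const char *uri) {
    memset(r, 0, sizeof *r);
    snprintf(r->unparsed_uri, sizeof r->unparsed_uri, "%s", uri);
}

/* Append a key/value pair; the value is copied, the key is not. */
void note_set(fake_request *r, const char *key, const char *value) {
    assert(r->note_count < MAX_NOTES);
    note_t *n = &r->notes[r->note_count++];
    n->key = key;
    snprintf(n->value, sizeof n->value, "%s", value);
}

/* Return the value stored under key, or NULL if there is none. */
const char *note_get(const fake_request *r, const char *key) {
    for (int i = 0; i < r->note_count; i++) {
        if (strcmp(r->notes[i].key, key) == 0) {
            return r->notes[i].value;
        }
    }
    return NULL;
}

/* Early hook: save a copy of the URL before any other hook can change it. */
void stash_original_url(fake_request *r) {
    note_set(r, ORIGINAL_URL_NOTE, r->unparsed_uri);
}

/* Stand-in for a later hook that overwrites the URL. */
void mutate_url(fake_request *r) {
    snprintf(r->unparsed_uri, sizeof r->unparsed_uri, "%s",
             "/rewritten/by/another/module");
}

/* Downstream handler: prefer the stashed copy, fall back to the live URL. */
const char *url_for_handler(const fake_request *r) {
    const char *saved = note_get(r, ORIGINAL_URL_NOTE);
    return saved != NULL ? saved : r->unparsed_uri;
}
```

Calling init_request(), then stash_original_url(), then mutate_url()
leaves unparsed_uri rewritten while url_for_handler() still returns the
URL as it arrived. The fallback to unparsed_uri covers requests where the
early hook never ran.
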
This seems to work well. The only drawback is that if the site admin adds
a mod_rewrite rule that mutates mod_pagespeed's resource name into
something that does not pass authentication, mod_authz_host will reject
the request before I can process it. This seems like a reasonable
tradeoff, as that configuration would likely be borked in other ways
besides mod_pagespeed resources.

Commentary would be welcome.

-Josh

On Mon, Jan 3, 2011 at 1:10 PM, Joshua Marantz wrote:

> I have implemented Ben's hack in mod_pagespeed in
> http://code.google.com/p/modpagespeed/source/detail?r=345 . It works
> great. But I am concerned that a subtle change to mod_rewrite.c will break
> this hack silently. We would catch it in our regression tests, but the
> large number of Apache users that have downloaded mod_pagespeed do not
> generally run our regression tests.
>
> I have another idea for a solution that I'd like to see opinions on.
> Looking at Nick Kew's book, it seems like I could set request->filename to
> whatever I wanted, return OK, but then also shunt off access_checker for my
> rewritten resources. The access checking on mod_pagespeed resources is
> redundant, because the resource will either be served from cache (in which
> case it had to be authenticated to get into the cache in the first place) or
> will be decoded and the original resource(s) fetched from the same server
> with full authentication.
>
> I'd appreciate any comments on this approach.
>
> -Josh
>
>
> On Mon, Jan 3, 2011 at 11:40 AM, Joshua Marantz wrote:
>
>> OK I tried to find a more robust alternative but could not. I was
>> thinking I could duplicate whatever mod_rewrite was doing to set the request
>> filename that appears to be complex and probably no less brittle.
>>
>> I have another query on this. In reality we do *not* want our rewritten
>> resources to be associated with a filename at all. Apache should never look
>> for such things in the file system under ../htdocs -- they will not be
>> there.
>> We also do not need it to validate or authenticate on these static
>> resources.
>>
>> In particular, we have found that there is some path through Apache that
>> imposes what looks like a file-system-based limitation on URL segments (e.g.
>> around 256 bytes). This limitation is inconvenient and, as far as I can
>> tell, superfluous. URL limits imposed by proxies and browsers are more like
>> 2k bytes, which would allow us to encode more metadata in URLs (e.g.
>> sprites). Is there some magic setting we could put into the request
>> structure to tell Apache not to interpret the request as being mapped from a
>> file, but just to pass it through to our handler?
>>
>> Thanks!
>> -Josh
>>
>> On Sat, Jan 1, 2011 at 6:24 AM, Ben Noordhuis wrote:
>>
>>> On Sat, Jan 1, 2011 at 00:16, Joshua Marantz wrote:
>>> > Thanks for the quick response and the promising idea for a hack. Looking at
>>> > mod_rewrite.c this does indeed look a lot more surgical, if, perhaps,
>>> > fragile, as mod_rewrite.c doesn't expose that string-constant in any formal
>>> > interface (even as a #define in a .h). Nevertheless the solution is
>>> > easy-to-implement and easy-to-test, so...thanks!
>>>
>>> You're welcome, Joshua. :)
>>>
>>> You could try persuading a core committer to add this as a
>>> (semi-)official extension. Nick Kew reads this list, Paul Querna often
>>> idles in #node.js at freenode.net.
>>>
>>> > I'm also still wondering if there's a good source of official documentation
>>> > for the detailed semantics of interfaces like ap_hook_translate_name.
>>> > Neither a Google Search, a stackoverflow.com search, nor the Apache
>>> > Modules book
>>> > <http://www.amazon.com/Apache-Modules-Book-Application-Development/dp/0132409674/ref=sr_1_1?ie=UTF8&qid=1293837117&sr=8-1>
>>> > offer much detail.
>>> > code.google.com fares a little better but just points to 4 existing usages.
>>>
>>> This question comes up often.
>>> In my experience the online
>>> documentation is almost always outdated, incomplete or outright wrong.
>>> I don't bother looking things up, I go straight to the source.
>>>
>>> It's a kind of job security, I suppose. There are only a handful of
>>> people that truly and deeply understand Apache. We can ask any hourly
>>> rate we want!
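
A footnote on the RUN_ALL behaviour described at the top of this message.
The sketch below is a simplified model of how a RUN_ALL hook chain differs
from a RUN_FIRST one, written as standalone C rather than with the real
APR macros. The MODEL_ constants reuse the numeric values of Apache's OK,
DECLINED and HTTP_FORBIDDEN; the function names and the two sample
checkers are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Same numeric values as Apache's OK, DECLINED and HTTP_FORBIDDEN. */
#define MODEL_OK 0
#define MODEL_DECLINED (-1)
#define MODEL_FORBIDDEN 403

typedef int (*hook_fn)(void *req);

/* RUN_ALL model: every hook is called; the chain stops only when a hook
 * returns something that is neither OK nor DECLINED. */
int run_all(const hook_fn *hooks, size_t n, void *req) {
    for (size_t i = 0; i < n; i++) {
        int rv = hooks[i](req);
        if (rv != MODEL_OK && rv != MODEL_DECLINED) {
            return rv;
        }
    }
    return MODEL_OK;
}

/* RUN_FIRST model: the chain stops at the first hook that does not decline. */
int run_first(const hook_fn *hooks, size_t n, void *req) {
    for (size_t i = 0; i < n; i++) {
        int rv = hooks[i](req);
        if (rv != MODEL_DECLINED) {
            return rv;
        }
    }
    return MODEL_DECLINED;
}

/* Call counters so the effect of each dispatch style can be observed. */
int my_checker_calls = 0;
int host_checker_calls = 0;

/* Hypothetical checker that tries to wave the request through. */
int my_access_checker(void *req) {
    (void)req;
    my_checker_calls++;
    return MODEL_OK;
}

/* Hypothetical stand-in for a host-based checker that rejects the request. */
int host_access_checker(void *req) {
    (void)req;
    host_checker_calls++;
    return MODEL_FORBIDDEN;
}
```

With the chain { my_access_checker, host_access_checker }, run_all() still
reaches the second checker and returns its 403, which is the failure mode
described above. run_first() would have stopped at the first OK.
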