From: Joshua Marantz
Date: Sat, 12 Mar 2011 20:15:12 -0500
Subject: Saving the original request URI ahead of a mod_rewrite
To: modules-dev@httpd.apache.org, Naomi

Hi,

A new bug has surfaced in mod_pagespeed. We understand the cause, but would welcome advice on the best way to fix it. The problem is tracked in http://code.google.com/p/modpagespeed/issues/detail?id=234.

Briefly, mod_pagespeed seeks to improve the performance of web sites by rewriting them while they are being served from Apache. mod_pagespeed transforms the HTML text in an output filter. To do this correctly, mod_pagespeed needs to know what URL the browser thinks it has when it is displaying a site.

The failure scenario is a site with a RewriteRule in an .htaccess file. The request->unparsed_uri gets rewritten by mod_rewrite, so by the time mod_pagespeed runs it has the wrong idea of where the page is. This makes mod_pagespeed resolve relative URLs embedded in the HTML in a manner inconsistent with the browser.

We thought we had solved this by putting in an early-running hook that saves the original request->unparsed_uri in request->notes. That seems to work in some cases but, we've found, not when the RewriteRule is in an .htaccess file. In that case, mod_rewrite triggers an "internal redirect", which causes an entirely new 'request' to be allocated, and the new request does *not* have a copy of the original request->notes. It does get a hacked version of request->subprocess_env, however, with "REDIRECT_" prepended to each key. It also seems that the new request has a pointer to the original request (which has the note) in request->prev. But this new request, without the notes, is the one that's passed to mod_pagespeed's output filter, with request->unparsed_uri pointing to the rewritten URL, which is not consistent with the browser's URL bar.

So I'm writing to this group to get suggestions on the most robust way to fix this. Here are some ideas:

1. Add an early 'create_request' hook and use that to copy the 'notes' that we care about from request->prev.

2. Change from storing notes in request->notes to request->subprocess_env. When we do the lookup with our key, we can look up REDIRECT_key if an entry with our original key is not found. This strikes me as a brittle hack.

3. Follow the request->prev chain when looking up notes. This strikes me as risky because I have no idea what happens to the ->prev chain through all the modules in the Apache ecosystem, or how far down the chain I might have to go.

So I like #1 best. Any other opinions or ideas?

-Josh
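
For illustration, ideas 1 and 3 above can be sketched in plain C. This is a minimal sketch, not mod_pagespeed's code: mock_request is a hypothetical stand-in for the few request_rec fields discussed here (the real struct keeps notes in an apr_table_t), and the function names and the max_hops bound are assumptions made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the request_rec fields discussed above.
 * The real struct is declared in httpd.h and stores notes in a table. */
typedef struct mock_request {
    const char *unparsed_uri;       /* URI as this request sees it */
    const char *original_uri_note;  /* models one entry in request->notes */
    struct mock_request *prev;      /* request that redirected to this one */
} mock_request;

/* Early hook: record the URI before anything rewrites it.
 * An existing note is never overwritten. */
static void save_original_uri(mock_request *r)
{
    assert(r != NULL);
    if (r->original_uri_note == NULL) {
        r->original_uri_note = r->unparsed_uri;
    }
}

/* Idea 1: when a new request is created by an internal redirect,
 * copy the note forward from ->prev so later lookups stay local. */
static void inherit_note_from_prev(mock_request *r)
{
    if (r->original_uri_note == NULL && r->prev != NULL) {
        r->original_uri_note = r->prev->original_uri_note;
    }
}

/* Idea 3: walk the ->prev chain until some request carries the note.
 * max_hops is an assumed safety bound on how many ->prev links to follow. */
static const char *lookup_original_uri(const mock_request *r, int max_hops)
{
    for (int hops = 0; r != NULL && hops <= max_hops; r = r->prev, ++hops) {
        if (r->original_uri_note != NULL) {
            return r->original_uri_note;
        }
    }
    return NULL;
}
```

With idea 1 the copy happens once per redirect and the output filter only ever reads the current request. With idea 3 nothing is copied, but every lookup depends on the ->prev chain being intact, which is the risk raised above.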