From: André Warnier
Date: Wed, 14 Nov 2012 14:48:25 +0100
To: mod_perl list
Subject: Re: question on sub-requests

Torsten Förtsch wrote:
> On 11/13/2012 07:17 PM, André Warnier wrote:
>> I didn't want to take too much time of anyone before, which is why I
>> somewhat oversimplified the issue. But considering the traffic on the
>> list is low, maybe you want to hear the whole story after all.
>>
>> The basic case is this : a bit aside from our usual professional
>> activities and for a friend, we run a website which is basically a shop
>> with hundreds of individual items which people can view and buy. (I
>> will provide the URL privately to anyone who is more interested. It's a
>> cute shop.)
>>
>> The pages corresponding to these individual items, at the moment, are
>> individual static pages in multiple sub-directories, and there are quite
>> a lot of them. The friend creates and maintains these static pages
>> herself; she is an artist rather than a programmer, so she can handle an
>> html editor (which she does rather well) to edit static pages and test
>> them on her PC before copying them to the server, but we cannot ask her
>> to handle any kind of "template" pages or the like.
>> Add to this that the basic logic of the website and the design and
>> techniques used date back from some 10 years ago, have been patched and
>> repatched several times over several years, and are rather bad.
>>
>> Now there is a requirement that, instead of being just static pages,
>> each of these pages should in addition contain a <form> with some
>> specific item-related information, allowing to buy these items on-line
>> (so it cannot be done just with an include or a stylesheet).
>
> Have you considered to patch the files automatically? CPAN has a bunch
> of modules that can parse also quite bad/old HTML (with missing or
> and the stuff).
>
> You could for example keep the original files in one directory tree and
> set up a process that waits for changes there either by scanning it on a
> regular basis or even by using inotify (there is a cron-like daemon that
> is made just for this kind of monitoring. I think it's called incron or
> so. Anyway, there is plenty of support for inotify on CPAN.)
>
> Now, if anything changes the daemon can start a program that parses the
> HTML, inserts the form and writes the result to another directory which
> is your DocumentRoot. On the way it could also call 'git commit'. And
> now you have a simple content management system almost for free.
>
>> What I am trying to achieve, without having to edit each of these
>> individual hundreds of pages, or changing the links to these pages, or
>> change the basic design of the application (because there is no budget
>> for that), is to find a clever way on the server side
>> to respond to a normal "/Shop_xxx/def/xyz.html" URL of one of these
>> pages, to combine into one response both the required <form>, and the
>> content of the existing unmodified static page. I also do not want to
>> parse the html on-the-fly and insert a <form> right into it, because the
>> html that she creates with her (T-online) editor is rather bad to start
>> with, and I have no guarantee that the result would be pleasing.
>>
>> So that was the reason to think of the <frameset> solution, whereby in
>> response to the initial request for "/Shop_xxx/def/xyz.html", I would
>> respond with a first <frame> containing the form (generated by a
>> back-end application, and depending on the item), and a second <frame>
>> containing her artfully-crafted static page describing that item in all
>> its glory.
>>
>> The static pages in question are in several subdirectories of
>> DocumentRoot, and at different levels.
>> Fortunately, all the top
>> sub-directories names start with "/Shop_" (after which there can be
>> "Quilts" or "Babydecken" and things like that, and a variable hierarchy
>> of sub-directories containing html files and jpg images and the like).
>>
>> So I have this configured :
>>
>> sethandler modperl
>> PerlResponseHandler My::ShopResponse
>> ...
>>
>>
>> As a result in part of the previous communications on this list, this
>> PerlResponseHandler
>> does more or less what I want, except one remaining problem which I am
>> trying to resolve right now :
>> In response to an initial request for "/Shop_xxx/def/xyz.html", the
>> handler generates a
>> document as such :
>>
>>
>>
>> (1)
>> (2)
>>
>>
>>
>> (1) for the dynamically-generated html document
>> (2) for the static existing page
>>
>> Because the second frame's URI also starts with "/Shop_", when the
>> browser requests this frame, the same ResponseHandler is called.
>> The handler examines the URL and sees that it ends in ".shop" (instead
>> of ".html").
>> So it knows that this time, it should not send another frameset, but
>> instead it should strip the trailing ".shop" and deliver, as is, the
>> content of the static document "/Shop_xxx/def/xyz.html".
>>
>> But, how do I tell it to do that ?
>> I have tried :
>> my $uri = $r->uri();
>> if ($uri =~ m!\/([^/]+\.htm[l]?\.shop)$!i) {
>>     $uri =~ s/\.shop$//; # strip the trailing ".shop"
>>     $r->internal_redirect($uri);
>>     return Apache2::Const::OK ;
>> }
>>
>> and also :
>>
>> my $uri = $r->uri();
>> if ($uri =~ m!\/([^/]+\.htm[l]?\.shop)$!i) {
>>     $uri =~ s/\.shop$//; # strip the trailing ".shop"
>>     my $subr = $r->lookup_uri($uri);
>>     $subr->run();
>>     return Apache2::Const::OK ;
>> }
>>
>> but both of those result in a loop : they end up requesting
>> "/Shop_xxx/def/xyz.html", which hits the same Location, which runs the
>> same handler, which then produces the frameset, and so on.
>
> Here is how the cycle goes:
>
> 1) The server gets a request for /Shop_xxx/def/xyz.html, skips the
> if-branch and generates the frameset.
>
> 2) The bottom frame generates another request for
> /Shop_xxx/def/ It enters the if-branch and issues a
> subrequest or an internal redirect for /Shop_xxx/def/xyz.html.
>
> 3) The subreq/redir enters the handler again. Now it avoids the if-branch
> because it does not match /\.shop$/. So it spits out the frameset.
>
> 4) goto 2) by means of the browser
>
> How to break the loop? Instead (or inside) of the LocationMatch above use:
>
> PerlFixupHandler "sub { \
>     use Apache2::RequestUtil (); \
>     use Apache2::RequestRec (); \
>     use Apache2::Const -compile=>qw/DECLINED/; \
>     my ($r)=@_; \
>     if( $r->is_initial_req ) { \
>         $r->handler('modperl'); \
>         $r->set_handlers(PerlResponseHandler=>'My::ShopResponse'); \
>     } \
>     return Apache2::Const::DECLINED; \
> }"
>
> Then step 3) reads:
>
> 3) The subreq/redir enters the request cycle again. The if-branch of
> the fixup handler is skipped because the request is not initial. Hence, the
> PerlResponseHandler is skipped completely and the default handler sends
> the document.
>
> You can achieve a similar effect with mod_rewrite. It has IS_SUBREQ
> available in RewriteCond. But I don't know if that also checks for
> !$r->prev.
>
> You can also try to modify your Response handler to decline if
> !$r->is_initial_req:
>
> my $uri = $r->uri();
> if ($uri =~ s!(/[^/]+\.html?)\.shop$!$1!i) {
>     $r->internal_redirect($uri);
>     return Apache2::Const::OK ;
> }
> return Apache2::Const::DECLINED unless $r->is_initial_req;
>
> I am not sure if that works. I think I have never tried to return
> DECLINED from a response handler.
>
>> So how do I tell Apache/mod_perl that this time "I mean it", and that it
>> should directly deliver the requested file, without re-running the whole
>> cycle ?
>>
>> I can of course request the corresponding filename() and deliver it
>> myself (perhaps with sendfile()), but that does not seem to be the most
>> elegant way of doing this. Or is it ?
>>
>> Oh, and I'd like it elegant, but I would prefer not having to introduce
>> a PerlFixupHandler or a PerlOutputFilter or Javascript, and do it all
>> within this ResponseHandler. That's because I have colleagues who know
>> even less about mod_perl than I do, and I'd like to leave them something
>> simple to deal with in support and maintenance for another 10 years.
>>
>> ...
>>
>> Aside : I just tried
>>
>> my $uri = $r->uri();
>> if ($uri =~ m!\/([^/]+\.htm[l]?\.shop)$!i) {
>>     $uri =~ s/\.shop$//; # strip the trailing ".shop"
>>     my $subr = $r->lookup_uri($uri);
>>     $r->sendfile($subr->filename());
>>     return Apache2::Const::OK ;
>> }
>>
>> and that works. So I guess that /is/ the right solution here.
>
> No, it's not.
>
> The default handler does a bit more than just send $r->filename. For you
> it may work but it won't in the general case. See default_handler() in
> server/core.c.
>
> Torsten
>

Hi Torsten.
Thanks for all your insightful and informative answers.

In my particular case, the sendfile() solution above works fine, because
I am sure in this case that no additional processing needs to take place
on those static pages. But (even without checking the source of
default_handler()) I understand that there may be cases where it
wouldn't, and appreciate your tips for the future (and see below).

About returning DECLINED in a ResponseHandler : that works fine, and
apparently has the effect of Apache serving the resource using its
default handler (or, I suppose, any other alternative response handler
that may be chained).
This is consistent with the fact that a response handler is of type

quote
RUN_FIRST

Handlers of the type RUN_FIRST will be executed in the order they have
been registered until the first handler that returns something other
than Apache2::Const::DECLINED.
If the return value is Apache2::Const::DECLINED, the next handler in the
chain will be run. If the return value is Apache2::Const::OK the next
phase will start. In all other cases the execution will be aborted.
unquote

As to the "initial vs subrequest" test : I have replaced my code section
by yours above :

if ($uri =~ s!(/[^/]+\.html?)\.shop$!$1!i) {
    $r->internal_redirect($uri);
    return Apache2::Const::OK ;
}
return Apache2::Const::DECLINED unless $r->is_initial_req;

and it is my pleasure to confirm that it works, and does avoid the
"fractal" loop (and it is also more elegant than mine).

About the option of pre-processing the html files : I had thought of
that before too, and in fact the main reason for introducing this
ResponseHandler was to replace a previous procedure that was doing
something similar (processing all the html files coming from that
less-than-optimal html editor, with a "cleanup" perl script). But I
ended up having to process so many "special cases" that in the end I
thought it was just easier to deliver the static pages "as is", and let
our friend correct them manually if they don't look exactly the way she
wants. Sometimes, less is more.

What should really happen with that website is a complete redesign,
using the more modern techniques we're using nowadays with our other
websites (TT2 e.g.). But there is no budget for that, so this nth patch
will have to do for now.

Thanks to all anyway. This list is not very chatty, but always very
helpful. And I am a real mod_perl fan. It is an incredibly powerful tool.
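
For the archives, the three-way decision that the final handler makes can be sketched in plain Perl, without Apache, to show why the loop stops. This is only an illustration under stated assumptions: shop_action() is a hypothetical helper, not part of mod_perl, and the $is_initial_req flag stands in for what $r->is_initial_req would report (false for subrequests and internal redirects). The substitution is the one from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not a mod_perl API): mirrors the branch order of the
# response handler from the thread, with the request object replaced by a
# URI string and a flag standing in for $r->is_initial_req.
sub shop_action {
    my ($uri, $is_initial_req) = @_;

    # Frame request: strip the trailing ".shop"; the real handler would
    # then call $r->internal_redirect($uri) and return OK.
    if ($uri =~ s!(/[^/]+\.html?)\.shop$!$1!i) {
        return ('redirect', $uri);
    }

    # Subrequest or internal redirect: the real handler returns DECLINED,
    # so the default handler serves the static file.
    return ('decline', $uri) unless $is_initial_req;

    # Initial browser request for the page itself: send the frameset.
    return ('frameset', $uri);
}

# The three requests that make up one page view.
for my $case (
    [ '/Shop_xxx/def/xyz.html',      1 ],
    [ '/Shop_xxx/def/xyz.html.shop', 1 ],
    [ '/Shop_xxx/def/xyz.html',      0 ],
) {
    my ($action, $uri) = shop_action(@$case);
    print "$case->[0] (initial=$case->[1]) -> $action $uri\n";
}
```

The third case is the one that used to loop: the redirected request for xyz.html is no longer an initial request, so it is declined instead of producing another frameset.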